International Series in Pure and Applied Mathematics 
William Ted Martin, Consulting Editor


ELEMENTS OF PURE AND APPLIED MATHEMATICS 


International Series in Pure and Applied Mathematics 
William Ted Martin, Consulting Editor


Ahlfors - Complex Analysis
Bellman - Stability Theory of Differential Equations
Buck - Advanced Calculus
Coddington and Levinson - Theory of Ordinary Differential Equations
Golomb & Shanks - Elements of Ordinary Differential Equations
Graves - The Theory of Functions of Real Variables
Griffin - Elementary Theory of Numbers
Hildebrand - Introduction to Numerical Analysis
Householder - Principles of Numerical Analysis
Lass - Elements of Pure and Applied Mathematics
Lass - Vector and Tensor Analysis
Leighton - An Introduction to the Theory of Differential Equations
Nehari - Conformal Mapping
Newell - Vector Analysis
Rosser - Logic for Mathematicians
Rudin - Principles of Mathematical Analysis
Sneddon - Elements of Partial Differential Equations
Sneddon - Fourier Transforms
Stoll - Linear Algebra and Matrix Theory
Weinstock - Calculus of Variations


ELEMENTS OF 
PURE AND APPLIED 
MATHEMATICS 


HARRY LASS 


CONTENTS 


Preface
1. Linear Equations, Determinants, and Matrices
2. Vector Analysis
3. Tensor Analysis
4. Complex-variable Theory
5. Differential Equations
6. Orthogonal Polynomials, Fourier Series, and Fourier Integrals
7. The Stieltjes Integral, Laplace Transform, and Calculus of Variations
8. Group Theory and Algebraic Equations
9. Probability Theory and Statistics
10. Real-variable Theory
Index




CHAPTER 1 


LINEAR EQUATIONS, DETERMINANTS, AND MATRICES 


1.1. Introduction. The Summation Convention. In much of the
material of this chapter we shall find it expedient to adopt a summation
convention first introduced by A. Einstein. Let us consider first the
set of linear equations

$$\begin{aligned}
a_1x + b_1y + c_1z &= d_1\\
a_2x + b_2y + c_2z &= d_2\\
a_3x + b_3y + c_3z &= d_3
\end{aligned} \tag{1.1}$$

We shall find it to our advantage to set $x = x^1$, $y = x^2$, $z = x^3$. The
superscripts do not denote powers but are simply a means for distinguishing
between the three quantities $x$, $y$, and $z$. One immediate advantage
is obvious. If we were dealing with 29 variables, it would be foolish
to use 29 different letters, one letter for each variable. The single letter $x$
with a set of superscripts ranging from 1 to 29 would suffice to yield the
29 variables, written $x^1, x^2, x^3, \ldots, x^{29}$. Our reason for using superscripts
rather than subscripts will soon become evident. Equations (1.1)
can now be written

$$\begin{aligned}
a_1x^1 + b_1x^2 + c_1x^3 &= d_1\\
a_2x^1 + b_2x^2 + c_2x^3 &= d_2\\
a_3x^1 + b_3x^2 + c_3x^3 &= d_3
\end{aligned} \tag{1.2}$$


Equations (1.2) still leave something to be desired, for if there were
29 such equations, our patience would be exhausted in trying to deal
with the coefficients of $x^1, x^2, x^3, \ldots, x^{29}$. Let us note that in (1.2)
the coefficients of $x^1$, $x^2$, $x^3$ may be expressed by the square array

$$\begin{matrix}
a_1 & b_1 & c_1\\
a_2 & b_2 & c_2\\
a_3 & b_3 & c_3
\end{matrix} \tag{1.3}$$

By defining $a_1 = a_{11}$, $b_1 = a_{12}$, $c_1 = a_{13}$, $a_2 = a_{21}$, $b_2 = a_{22}$, $c_2 = a_{23}$,
$a_3 = a_{31}$, $b_3 = a_{32}$, $c_3 = a_{33}$, the square array (1.3) becomes

$$\begin{matrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{matrix} \tag{1.4}$$





One advantage is immediately evident. The single element $a_{ij}$ lies in
the $i$th row and $j$th column of the square array (1.4). Equations (1.1)
can now be written

$$\begin{aligned}
a_{11}x^1 + a_{12}x^2 + a_{13}x^3 &= d_1\\
a_{21}x^1 + a_{22}x^2 + a_{23}x^3 &= d_2\\
a_{31}x^1 + a_{32}x^2 + a_{33}x^3 &= d_3
\end{aligned} \tag{1.5}$$

Using the familiar summation notation of mathematics, we rewrite (1.5)
as

$$\sum_{r=1}^{3} a_{1r}x^r = d_1 \qquad \sum_{r=1}^{3} a_{2r}x^r = d_2 \qquad \sum_{r=1}^{3} a_{3r}x^r = d_3 \tag{1.6}$$

or

$$\sum_{r=1}^{3} a_{ir}x^r = d_i \qquad i = 1, 2, 3 \tag{1.7}$$

The system of equations

$$\sum_{r=1}^{n} a_{ir}x^r = d_i \qquad i = 1, 2, \ldots, m \tag{1.8}$$

represents $m$ linear equations.
A. Einstein noticed that it was superfluous to carry along the $\Sigma$ sign
in (1.8). We may rewrite (1.8) as

$$a_{ir}x^r = d_i \qquad i = 1, 2, \ldots, m \tag{1.9}$$


provided it is understood that whenever an index occurs exactly once
both as a subscript and superscript a summation is indicated for this
index over its full range of definition. In (1.9) the index $r$ occurs both
as a subscript (in $a_{ir}$) and as a superscript (in $x^r$), so that we sum on $r$
from $r = 1$ to $r = n$. In a four-dimensional space ($x^1 = x$, $x^2 = y$,
$x^3 = z$, $x^4 = ct$) summation indices range from 1 to 4. The index of
summation is a dummy index since the final result is independent of the
letter used. We can write

$$a_{ir}x^r = a_{i\alpha}x^\alpha = a_{i\beta}x^\beta \tag{1.10}$$


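We may also write (1.9) as

$$a^i_r x^r = d^i \qquad i, r = 1, 2, \ldots, n \tag{1.11}$$

where the element $a^i_j$ belongs to the $i$th row and $j$th column of the square
array

$$\begin{matrix}
a^1_1 & a^1_2 & \cdots & a^1_n\\
a^2_1 & a^2_2 & \cdots & a^2_n\\
\vdots & \vdots & & \vdots\\
a^n_1 & a^n_2 & \cdots & a^n_n
\end{matrix} \tag{1.12}$$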






Example 1.1. Let us consider the quantity $S = a_{\alpha\beta}x^\alpha x^\beta$ for a three-dimensional
space. Since the index $\alpha$ occurs as both a subscript and a superscript, we sum on $\alpha$
from 1 to 3. This yields $S = a_{1\beta}x^1x^\beta + a_{2\beta}x^2x^\beta + a_{3\beta}x^3x^\beta$. Now each term of $S$ is
such that $\beta$ is both a subscript and superscript. Summing on $\beta$ from 1 to 3 as
prescribed by our summation convention yields the quadratic form

$$\begin{aligned}
S ={}& a_{11}x^1x^1 + a_{12}x^1x^2 + a_{13}x^1x^3\\
 &+ a_{21}x^2x^1 + a_{22}x^2x^2 + a_{23}x^2x^3\\
 &+ a_{31}x^3x^1 + a_{32}x^3x^2 + a_{33}x^3x^3
\end{aligned} \tag{1.13}$$


Why is dearts? = ayjetx! = ae? What does deer"? mean? 
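Example 1.2. If $x^1, x^2, x^3, \ldots, x^n$ is a set of independent variables, then

$$\frac{\partial x^1}{\partial x^1} = 1, \quad \frac{\partial x^1}{\partial x^2} = 0, \quad \ldots, \quad \frac{\partial x^i}{\partial x^j} = 1 \text{ if } i = j, \quad \frac{\partial x^i}{\partial x^j} = 0 \text{ if } i \neq j$$

We may write

$$\frac{\partial x^i}{\partial x^j} = \delta^i_j \qquad \delta^i_j = \begin{cases} 1 & \text{if } i = j\\ 0 & \text{if } i \neq j \end{cases} \tag{1.14}$$

The symbol $\delta^i_j$ is called the Kronecker¹ delta. We have

$$\delta^i_i = \delta^1_1 + \delta^2_2 + \cdots + \delta^n_n = n$$

Let us now assume that the quadratic form (1.13) vanishes identically for all values
of the independent variables $x^1$, $x^2$, $x^3$, the $a_{ij}$ to be constant. Differentiating
$S = a_{\alpha\beta}x^\alpha x^\beta = 0$ with respect to a given variable, say $x^i$, yields

$$\begin{aligned}
\frac{\partial S}{\partial x^i} &= a_{\alpha\beta}x^\alpha\frac{\partial x^\beta}{\partial x^i} + a_{\alpha\beta}\frac{\partial x^\alpha}{\partial x^i}x^\beta = 0\\
&= a_{\alpha\beta}x^\alpha\delta^\beta_i + a_{\alpha\beta}x^\beta\delta^\alpha_i = 0\\
&= a_{\alpha i}x^\alpha + a_{i\beta}x^\beta = 0 \qquad \text{Why?}
\end{aligned}$$

Now differentiating with respect to $x^j$ yields

$$\frac{\partial^2 S}{\partial x^j\,\partial x^i} = a_{\alpha i}\delta^\alpha_j + a_{i\beta}\delta^\beta_j = 0$$

so that $a_{ji} + a_{ij} = 0$ or $a_{ij} = -a_{ji}$, $i, j = 1, 2, 3$.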
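The converse of this result (a form with $a_{ij} = -a_{ji}$ vanishes for every choice of the variables) is easy to spot-check numerically. The sketch below is only an illustration: the function name `quadratic_form` and the skew coefficients are assumed for the example and do not come from the text.

```python
def quadratic_form(a, x):
    """Evaluate S = a_ij x^i x^j by summing over both repeated indices."""
    n = len(x)
    return sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Illustrative coefficients satisfying a_ij = -a_ji.
a_skew = [[0, 2, -5],
          [-2, 0, 7],
          [5, -7, 0]]

for point in ([1, 2, 3], [-4, 1, 9], [0, 0, 1]):
    print(point, quadratic_form(a_skew, point))
```

Each off-diagonal pair contributes $a_{ij}x^ix^j + a_{ji}x^jx^i = 0$, so every evaluation prints zero.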
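Example 1.3. We define $\epsilon^{ij}$, $i, j = 1, 2$, to have the following numerical values:
$\epsilon^{11} = \epsilon^{22} = 0$, $\epsilon^{12} = 1$, $\epsilon^{21} = -1$. We now consider the expression

$$D = \epsilon^{ij}a^1_ia^2_j \tag{1.15}$$

Expanding (1.15) by use of our summation convention yields

$$D = \epsilon^{11}a^1_1a^2_1 + \epsilon^{12}a^1_1a^2_2 + \epsilon^{21}a^1_2a^2_1 + \epsilon^{22}a^1_2a^2_2 = a^1_1a^2_2 - a^1_2a^2_1$$

The reader who is familiar with second-order determinants quickly recognizes that

$$D = \begin{vmatrix} a^1_1 & a^1_2\\ a^2_1 & a^2_2 \end{vmatrix} \tag{1.16}$$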


¹ For Leopold Kronecker (1823-1891), a world-famous German mathematician.


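Example 1.4. The system of equations

$$y^i = y^i(x^1, x^2, \ldots, x^n) \qquad i = 1, 2, \ldots, n \tag{1.17}$$

represents a coordinate transformation from an $(x^1, x^2, \ldots, x^n)$ coordinate system
to a $(y^1, y^2, \ldots, y^n)$ coordinate system. From the calculus we have

$$dy^i = \frac{\partial y^i}{\partial x^1}dx^1 + \frac{\partial y^i}{\partial x^2}dx^2 + \cdots + \frac{\partial y^i}{\partial x^n}dx^n = \frac{\partial y^i}{\partial x^\alpha}dx^\alpha \qquad i = 1, 2, \ldots, n$$

The $\alpha$ in the term $\partial y^i/\partial x^\alpha$ is to be considered as a subscript. If, furthermore, the $x^i$,
$i = 1, 2, \ldots, n$, can be solved for in terms of the $y^1, y^2, \ldots, y^n$, and assuming differentiability
of the $x^i$ with respect to each $y^j$, one obtains

$$\delta^i_j = \frac{\partial y^i}{\partial y^j} = \frac{\partial y^i}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial y^j}$$

Differentiating this expression with respect to $y^k$ yields

$$0 = \frac{\partial^2 y^i}{\partial x^\beta\,\partial x^\alpha}\frac{\partial x^\beta}{\partial y^k}\frac{\partial x^\alpha}{\partial y^j} + \frac{\partial y^i}{\partial x^\alpha}\frac{\partial^2 x^\alpha}{\partial y^k\,\partial y^j}$$

Multiplying both sides of this equation by $\partial x^\sigma/\partial y^i$ and summing on the index $i$ yields

$$\begin{aligned}
0 &= \frac{\partial^2 y^i}{\partial x^\beta\,\partial x^\alpha}\frac{\partial x^\sigma}{\partial y^i}\frac{\partial x^\beta}{\partial y^k}\frac{\partial x^\alpha}{\partial y^j} + \frac{\partial x^\sigma}{\partial y^i}\frac{\partial y^i}{\partial x^\alpha}\frac{\partial^2 x^\alpha}{\partial y^k\,\partial y^j}\\
&= \frac{\partial^2 y^i}{\partial x^\beta\,\partial x^\alpha}\frac{\partial x^\sigma}{\partial y^i}\frac{\partial x^\beta}{\partial y^k}\frac{\partial x^\alpha}{\partial y^j} + \delta^\sigma_\alpha\frac{\partial^2 x^\alpha}{\partial y^k\,\partial y^j}
\end{aligned}$$

which yields

$$\frac{\partial^2 x^\sigma}{\partial y^k\,\partial y^j} = -\frac{\partial^2 y^i}{\partial x^\beta\,\partial x^\alpha}\frac{\partial x^\sigma}{\partial y^i}\frac{\partial x^\alpha}{\partial y^j}\frac{\partial x^\beta}{\partial y^k}$$

In particular, if $y = f(x)$, then

$$\frac{d^2x}{dy^2} = -\frac{d^2y}{dx^2}\left(\frac{dx}{dy}\right)^3$$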
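As a quick check of the one-variable formula, one may take the illustrative choice $y = x^2$, $x > 0$ (an assumption made here for the example, not taken from the text) and compute both sides independently:

```latex
\[
y = x^2,\ x > 0:\qquad
\frac{dx}{dy} = \frac{1}{2x},\qquad
\frac{d^2x}{dy^2} = \frac{d}{dx}\!\left(\frac{1}{2x}\right)\frac{dx}{dy}
  = -\frac{1}{2x^2}\cdot\frac{1}{2x} = -\frac{1}{4x^3},
\]
\[
-\frac{d^2y}{dx^2}\left(\frac{dx}{dy}\right)^3
  = -2\cdot\frac{1}{8x^3} = -\frac{1}{4x^3}.
\]
```

Both computations give $-1/(4x^3)$, in agreement with the formula.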
Problems 


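1. If $y^i = a^i_jx^j$, $z^i = b^i_jy^j$, show that $z^i = b^i_ja^j_kx^k$.
2. If $a_{ijk}x^ix^jx^k = 0$, $a_{ijk}$ = constant, show that

$$a_{ijk} + a_{kij} + a_{jki} + a_{jik} + a_{kji} + a_{ikj} = 0$$

3. Show that $\dfrac{\partial y^i}{\partial x^\alpha}\dfrac{\partial x^\alpha}{\partial y^j} = \delta^i_j$.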


4. lf A* = = al , show that A* = -— A@, 
seer. A RE oe PUT 
6. If fae = gus aye aye” show that ged = Gur ae od 




6. If the gp, of Prob. 5 are such that gp. = gop, show that gag = Gea. 
7. We define $\epsilon^{ijk}$ as follows: The superscripts $i$, $j$, $k$ are to take on the numerical
values 1, 2, 3, not respectively, and we define $\epsilon^{123} = \epsilon^{231} = \epsilon^{312} = 1$,
$\epsilon^{132} = \epsilon^{321} = \epsilon^{213} = -1$,
all other $\epsilon^{ijk} = 0$. Expand $\epsilon^{ijk}a^1_ia^2_ja^3_k$, and express this sum as a third-order
determinant.
riimant, 

B&B Ife = plz, 27,2. .,3%, faa, 24 a) ee tly), t= 1S, . ey 
end if @ = ety), yt)... ow) = ele (yy), oy)... , otty)], show that 


a6 _ de az? 
dy® dat ay 
oe 


GIVEN ¢a = - 


ana? show that 


08 _ 83; _ (ee _ des dx% dz! 
ay? dy ax ga® sy dy" ay? 
9. If $a_{\alpha\beta} = -a_{\beta\alpha}$, show that $a_{\alpha\beta}x^\alpha x^\beta = 0$.


1.2. Determinants. In attempting to solve (1.5) for the $x^1$, $x^2$, $x^3$ one
is led in a very natural way to consider the square array

$$\begin{matrix}
a^1_1 & a^1_2 & a^1_3\\
a^2_1 & a^2_2 & a^2_3\\
a^3_1 & a^3_2 & a^3_3
\end{matrix} \tag{1.18}$$

We have written $a^i_j = a_{ij}$, $i, j = 1, 2, 3$. The solution of (1.5) requires
that we attach a numerical value to the matrix (as yet undefined) of
elements (1.18). We do this in the following way (see Prob. 7, Sec. 1.1):
We attach $3^3 = 27$ numerical values to a set of $\epsilon^{ijk}$, $i, j, k = 1, 2, 3$.
If at least two of the superscripts in $\epsilon^{ijk}$ are the same, the value of $\epsilon^{ijk}$ is
zero. Thus $\epsilon^{112} = \epsilon^{313} = \epsilon^{222} = 0$, etc. If the $i$, $j$, $k$ are all different,
the value of $\epsilon^{ijk}$ is to be $+1$ or $-1$ according to whether it takes an even
or odd number of permutations to rearrange the $ijk$ into the natural
order 123. Let us look at $\epsilon^{321}$ and hence at the arrangement 321. Permuting
the integers 2 and 3 permutes 321 into 231, then permuting 3 and 1
permutes 231 into 213, and finally 213 permutes into 123 if we interchange
the integers 2 and 1. Three (an odd number) permutations were
required to permute 321 into 123. Thus $\epsilon^{321} = -1$. We have

$$\epsilon^{123} = \epsilon^{312} = \epsilon^{231} = +1$$

$$\epsilon^{213} = \epsilon^{321} = \epsilon^{132} = -1$$

We now define

$$\begin{vmatrix}
a^1_1 & a^1_2 & a^1_3\\
a^2_1 & a^2_2 & a^2_3\\
a^3_1 & a^3_2 & a^3_3
\end{vmatrix} = \epsilon^{ijk}a^1_ia^2_ja^3_k \tag{1.19}$$

The letters $i$, $j$, $k$ are indices of summation. Equation (1.19) defines the


determinant of the square matrix of elements (1.18). Its numerical
value is given by the right-hand side of (1.19). It consists, in general,
of $3! = 3 \cdot 2 \cdot 1 = 6$ terms, each term a product of three elements, one
element from each row and column of (1.19). For the benefit of the student
we expand (1.19) and obtain

$$\begin{aligned}
|a^i_j| = \epsilon^{ijk}a^1_ia^2_ja^3_k ={}& (a^1_1a^2_2a^3_3 + a^1_2a^2_3a^3_1 + a^1_3a^2_1a^3_2)\\
 &- (a^1_1a^2_3a^3_2 + a^1_2a^2_1a^3_3 + a^1_3a^2_2a^3_1)
\end{aligned}$$

Only $3! = 6$ terms occur in the expansion of (1.19) since there are $3!$
permutations of 123. All other values of $\epsilon^{ijk}$ are zero.
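The definition of $\epsilon^{ijk}$ and of the determinant (1.19) can be sketched in a few lines. The names `epsilon` and `det3` are assumptions made for illustration; the sign is computed from the number of out-of-order pairs (inversions), which has the same parity as any sequence of interchanges that restores the natural order.

```python
def epsilon(*idx):
    """Permutation symbol: 0 for a repeated index, else +1 or -1 by parity."""
    if len(set(idx)) != len(idx):
        return 0
    inversions = sum(
        1
        for p in range(len(idx))
        for q in range(p + 1, len(idx))
        if idx[p] > idx[q]
    )
    return -1 if inversions % 2 else 1


def det3(a):
    """|a| = eps^{ijk} a^1_i a^2_j a^3_k; a[r][c] holds row r+1, column c+1."""
    values = (1, 2, 3)
    return sum(
        epsilon(i, j, k) * a[0][i - 1] * a[1][j - 1] * a[2][k - 1]
        for i in values for j in values for k in values
    )


print(epsilon(3, 2, 1), epsilon(2, 3, 1), epsilon(1, 1, 2))
```

Of the 27 terms in the triple sum only the six with distinct indices are nonzero, as stated above.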

We can define $\epsilon_{ijk}$ in exactly the same manner in which the $\epsilon^{ijk}$ were
defined. We leave it to the reader to show that

$$\epsilon_{ijk}a^i_1a^j_2a^k_3 = \epsilon^{ijk}a^1_ia^2_ja^3_k$$


The generalization of second- and third-order determinants (the order
of a determinant is the number of rows or columns of the determinant) to
$n$th-order determinants is simple. We define the $\epsilon^{i_1i_2\cdots i_n}$ to have the
following numerical values: $\epsilon^{i_1i_2\cdots i_n} = 0$ if at least two of the superscripts
are the same. The values of the superscripts range from 1 to $n$.
If the $i_1, i_2, \ldots, i_n$ are distinct, the value of $\epsilon^{i_1i_2\cdots i_n}$ is to be $+1$ or
$-1$ depending on whether an even or odd number of permutations is
required to rearrange $i_1i_2\cdots i_n$ into the natural order $123\cdots n$.
The numerical value (determinant) of the square array of elements
$\|a^i_j\|$, $i, j = 1, 2, \ldots, n$, is defined as

$$|a^i_j| = \begin{vmatrix}
a^1_1 & a^1_2 & \cdots & a^1_n\\
a^2_1 & a^2_2 & \cdots & a^2_n\\
\vdots & \vdots & & \vdots\\
a^n_1 & a^n_2 & \cdots & a^n_n
\end{vmatrix}
= \epsilon^{i_1i_2\cdots i_n}a^1_{i_1}a^2_{i_2}\cdots a^n_{i_n}
= \epsilon_{i_1i_2\cdots i_n}a^{i_1}_1a^{i_2}_2\cdots a^{i_n}_n \tag{1.20}$$

where the $\epsilon_{i_1i_2\cdots i_n}$ are defined in precisely the same manner in which the
$\epsilon^{i_1i_2\cdots i_n}$ are defined. In general, (1.20) consists of $n!$ terms, each term a
product of $n$ elements, one element from each row and column of $|a^i_j|$.

To facilitate writing, we shall deal with third-order determinants, but
it will be obvious to the reader that any theorem derived for third-order
determinants will apply to determinants of any finite order. Let us
consider

$$A = \begin{vmatrix}
a^1_1 & a^1_2 & a^1_3\\
a^2_1 & a^2_2 & a^2_3\\
a^3_1 & a^3_2 & a^3_3
\end{vmatrix} = \epsilon^{ijk}a^1_ia^2_ja^3_k = \epsilon_{ijk}a^i_1a^j_2a^k_3 \tag{1.21}$$


We can obtain a new third-order determinant by interchanging the first
and third row of $A$. This yields

$$A' = \begin{vmatrix}
a^3_1 & a^3_2 & a^3_3\\
a^2_1 & a^2_2 & a^2_3\\
a^1_1 & a^1_2 & a^1_3
\end{vmatrix} = \epsilon^{ijk}a^3_ia^2_ja^1_k \tag{1.22}$$

But $\epsilon^{ijk}a^3_ia^2_ja^1_k = \epsilon^{kji}a^3_ka^2_ja^1_i = \epsilon^{kji}a^1_ia^2_ja^3_k$, since $i$ and $k$ are dummy indices.
We see that every term of (1.22) is the same as every term in (1.21)
with the exception that $\epsilon^{ijk}$ is replaced by $\epsilon^{kji}$. Since $\epsilon^{kji} = -\epsilon^{ijk}$, we
conclude that $A = -A'$. We thus obtain the following theorems:

THEOREM 1.1. Interchanging two rows (or columns) of a determinant
changes the sign of the determinant.

THEOREM 1.2. If two rows (or columns) of a determinant are the
same, the value of the determinant is zero.

We note that

$$\begin{vmatrix}
la^1_1 & la^1_2 & la^1_3\\
a^2_1 & a^2_2 & a^2_3\\
a^3_1 & a^3_2 & a^3_3
\end{vmatrix} = \epsilon^{ijk}(la^1_i)a^2_ja^3_k = lA \tag{1.23}$$

THEOREM 1.3. If a row (or column) of a determinant is multiplied by
a factor $l$, the value of the determinant is thereby multiplied by $l$.
Let us now investigate the determinant

$$\begin{aligned}
\begin{vmatrix}
a^1_1 + la^2_1 & a^1_2 + la^2_2 & a^1_3 + la^2_3\\
a^2_1 & a^2_2 & a^2_3\\
a^3_1 & a^3_2 & a^3_3
\end{vmatrix}
&= \epsilon^{ijk}(a^1_i + la^2_i)a^2_ja^3_k\\
&= \epsilon^{ijk}a^1_ia^2_ja^3_k + l\,\epsilon^{ijk}a^2_ia^2_ja^3_k\\
&= A
\end{aligned} \tag{1.24}$$

since $\epsilon^{ijk}a^2_ia^2_ja^3_k = 0$ from Theorem 1.2. Hence we obtain the following
theorem:

THEOREM 1.4. The value of a determinant remains unchanged if to
the elements of any row (or column) is added a scalar multiple of the
corresponding elements of another row (or column).

The theorems derived above are very useful in evaluating a determinant.


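Example 1.5

$$\begin{aligned}
A &= \begin{vmatrix} 1 & 0 & -2 & 3\\ 4 & -2 & 6 & 2\\ -1 & 3 & 0 & 1\\ 3 & 0 & 0 & -1 \end{vmatrix}
= 2\begin{vmatrix} 1 & 0 & -2 & 3\\ 2 & -1 & 3 & 1\\ -1 & 3 & 0 & 1\\ 3 & 0 & 0 & -1 \end{vmatrix}
= 2\begin{vmatrix} 1 & 0 & -2 & 3\\ 0 & -1 & 7 & -5\\ -1 & 3 & 0 & 1\\ 3 & 0 & 0 & -1 \end{vmatrix}\\[4pt]
&= 2\begin{vmatrix} 1 & 0 & -2 & 3\\ 0 & -1 & 7 & -5\\ 0 & 3 & -2 & 4\\ 3 & 0 & 0 & -1 \end{vmatrix}
= 2\begin{vmatrix} 1 & 0 & -2 & 3\\ 0 & -1 & 7 & -5\\ 0 & 3 & -2 & 4\\ 0 & 0 & 6 & -10 \end{vmatrix}
= 2\begin{vmatrix} 1 & 0 & -2 & 3\\ 0 & -1 & 7 & -5\\ 0 & 0 & 19 & -11\\ 0 & 0 & 6 & -10 \end{vmatrix}\\[4pt]
&= 2 \cdot 19\begin{vmatrix} 1 & 0 & -2 & 3\\ 0 & -1 & 7 & -5\\ 0 & 0 & 1 & -\tfrac{11}{19}\\ 0 & 0 & 6 & -10 \end{vmatrix}
= 2 \cdot 19\begin{vmatrix} 1 & 0 & -2 & 3\\ 0 & -1 & 7 & -5\\ 0 & 0 & 1 & -\tfrac{11}{19}\\ 0 & 0 & 0 & -\tfrac{124}{19} \end{vmatrix}\\[4pt]
&= 2 \cdot 19 \cdot 1 \cdot (-1) \cdot 1 \cdot \left(-\tfrac{124}{19}\right) = 248
\end{aligned}$$

The final result is obtained as follows: (1) Factor 2 from row 2; (2) multiply row 1
by $-2$, and add to row 2; (3) add row 1 to row 3; (4) multiply row 1 by 3, and subtract
from row 4; (5) multiply row 2 by 3, and add to row 3; (6) factor out 19 from row 3;
(7) multiply row 3 by $-6$, and add to row 4; (8) with zeros below the main diagonal,
the value of the determinant is the product of the diagonal terms. Why?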
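The row operations of Example 1.5 can be mechanized. The sketch below is one possible arrangement, not the text's own procedure: it uses only Theorems 1.1 and 1.4 (interchange rows, add multiples of one row to another) and exact rational arithmetic, so no factoring step is needed. The function name `det_by_elimination` is an assumption for illustration.

```python
from fractions import Fraction


def det_by_elimination(rows):
    """Reduce to triangular form, then multiply the diagonal terms."""
    a = [[Fraction(v) for v in row] for row in rows]
    n = len(a)
    sign = 1
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]   # Theorem 1.1: sign changes
            sign = -sign
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            # Theorem 1.4: subtracting a multiple of row c leaves the value unchanged.
            a[r] = [a[r][k] - factor * a[c][k] for k in range(n)]
    result = Fraction(sign)
    for i in range(n):
        result *= a[i][i]
    return result


example_1_5 = [[1, 0, -2, 3],
               [4, -2, 6, 2],
               [-1, 3, 0, 1],
               [3, 0, 0, -1]]
print(det_by_elimination(example_1_5))
```

Applied to the determinant of Example 1.5, the routine reproduces the value 248.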


The reader is urged to read Laplace's method of expanding a determinant,
which can be found in various texts dealing with determinants.


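Example 1.6. If the $a^i_j$ of (1.21) are differentiable functions of a variable $x$, then

$$\begin{aligned}
\frac{dA}{dx} &= \frac{d}{dx}\left(\epsilon^{ijk}a^1_ia^2_ja^3_k\right)\\
&= \epsilon^{ijk}\frac{da^1_i}{dx}a^2_ja^3_k + \epsilon^{ijk}a^1_i\frac{da^2_j}{dx}a^3_k + \epsilon^{ijk}a^1_ia^2_j\frac{da^3_k}{dx}\\[4pt]
&= \begin{vmatrix}
\dfrac{da^1_1}{dx} & \dfrac{da^1_2}{dx} & \dfrac{da^1_3}{dx}\\
a^2_1 & a^2_2 & a^2_3\\
a^3_1 & a^3_2 & a^3_3
\end{vmatrix}
+ \begin{vmatrix}
a^1_1 & a^1_2 & a^1_3\\
\dfrac{da^2_1}{dx} & \dfrac{da^2_2}{dx} & \dfrac{da^2_3}{dx}\\
a^3_1 & a^3_2 & a^3_3
\end{vmatrix}
+ \begin{vmatrix}
a^1_1 & a^1_2 & a^1_3\\
a^2_1 & a^2_2 & a^2_3\\
\dfrac{da^3_1}{dx} & \dfrac{da^3_2}{dx} & \dfrac{da^3_3}{dx}
\end{vmatrix}
\end{aligned}$$

We let the reader extend this result to determinants of order $n$.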

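We now investigate the sum $S = \epsilon^{ijk}a^\alpha_ia^\beta_ja^\gamma_k$. If $\alpha$, $\beta$, $\gamma$ take on the
values 1, 2, 3, respectively, the sum $S$ is the determinant of the elements
$\|a^i_j\|$. If $\alpha$, $\beta$, $\gamma$ take on the values 1, 2, 3, but not respectively, then
$S = |a^i_j|$ or $S = -|a^i_j|$ depending on whether it requires an even or odd
number of permutations to permute $\alpha$, $\beta$, $\gamma$ into 1, 2, 3 (see Theorem 1.1).
In all other cases $S = 0$ from Theorem 1.2. Hence we obtain the useful
identity

$$\epsilon^{ijk}a^\alpha_ia^\beta_ja^\gamma_k = |a^i_j|\,\epsilon^{\alpha\beta\gamma} \tag{1.25}$$

Similarly

$$\epsilon_{ijk}a^i_\alpha a^j_\beta a^k_\gamma = |a^i_j|\,\epsilon_{\alpha\beta\gamma} \tag{1.26}$$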


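We now obtain a method for multiplying two determinants of the
same order. We have from (1.21), (1.26)

$$\begin{aligned}
|a^i_j| \cdot |b^i_j| &= |a^i_j|\,\epsilon_{ijk}b^i_1b^j_2b^k_3\\
&= \epsilon_{\alpha\beta\gamma}a^\alpha_ia^\beta_ja^\gamma_kb^i_1b^j_2b^k_3\\
&= \epsilon_{\alpha\beta\gamma}(a^\alpha_ib^i_1)(a^\beta_jb^j_2)(a^\gamma_kb^k_3)\\
&= \epsilon_{\alpha\beta\gamma}c^\alpha_1c^\beta_2c^\gamma_3 = |c^i_j|
\end{aligned} \tag{1.27}$$

where $c^r_s = a^r_\alpha b^\alpha_s$, $r, s = 1, 2, 3$.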


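Example 1.7

$$\begin{vmatrix} a^1_1 & a^1_2\\ a^2_1 & a^2_2 \end{vmatrix} \cdot \begin{vmatrix} b^1_1 & b^1_2\\ b^2_1 & b^2_2 \end{vmatrix}
= \begin{vmatrix} a^1_1b^1_1 + a^1_2b^2_1 & a^1_1b^1_2 + a^1_2b^2_2\\ a^2_1b^1_1 + a^2_2b^2_1 & a^2_1b^1_2 + a^2_2b^2_2 \end{vmatrix}$$

$$\begin{aligned}
\begin{vmatrix} 2 & 0 & -1\\ -1 & 2 & 0\\ 3 & 1 & 1 \end{vmatrix} \cdot \begin{vmatrix} -2 & 0 & 0\\ 1 & 1 & -3\\ 0 & -1 & 1 \end{vmatrix}
&= \begin{vmatrix}
2(-2) + 0(1) + (-1)(0) & 2(0) + 0(1) + (-1)(-1) & 2(0) + 0(-3) + (-1)(1)\\
-1(-2) + 2(1) + 0(0) & -1(0) + 2(1) + 0(-1) & -1(0) + 2(-3) + 0(1)\\
3(-2) + 1(1) + 1(0) & 3(0) + 1(1) + 1(-1) & 3(0) + 1(-3) + 1(1)
\end{vmatrix}\\[4pt]
&= \begin{vmatrix} -4 & 1 & -1\\ 4 & 2 & -6\\ -5 & 0 & -2 \end{vmatrix} = 44
\end{aligned}$$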
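The product rule (1.27) can be checked on the numbers of Example 1.7. The helper names `det3` and `matmul` below are assumptions chosen for this sketch; `det3` expands along the first row.

```python
def det3(a):
    """Third-order determinant by expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def matmul(a, b):
    """c^i_j = a^i_alpha b^alpha_j, summed on alpha."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[2, 0, -1], [-1, 2, 0], [3, 1, 1]]
B = [[-2, 0, 0], [1, 1, -3], [0, -1, 1]]
C = matmul(A, B)

print(C)
print(det3(A), det3(B), det3(C))
```

The determinant of the array of elements $c^i_j$ equals the product of the two separate determinants, as (1.27) asserts.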





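Example 1.8. We define the Kronecker $\delta^i_j$ as follows [see (1.14)]:

$$\delta^i_j = \begin{cases} 0 & \text{if } i \neq j\\ 1 & \text{if } i = j \end{cases} \tag{1.28}$$

Thus $\delta^1_1 = \delta^2_2 = \delta^3_3 = 1$, $\delta^1_2 = \delta^2_1 = \delta^1_3 = \cdots = 0$. The determinant $|\delta^i_j|$ is the
determinant

$$|\delta^i_j| = \begin{vmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{vmatrix} = 1 \qquad i, j = 1, 2, 3$$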

We have

$$|a^i_j| = |a^i_\alpha\delta^\alpha_j| = |a^i_\alpha| \cdot |\delta^\alpha_j|$$
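We now expand a determinant in terms of its cofactors. We have

$$\begin{aligned}
A = \epsilon^{ijk}a^1_ia^2_ja^3_k &= a^1_1(\epsilon^{1jk}a^2_ja^3_k) + a^1_2(\epsilon^{2jk}a^2_ja^3_k) + a^1_3(\epsilon^{3jk}a^2_ja^3_k)\\
&= a^1_1A^1_1 + a^1_2A^2_1 + a^1_3A^3_1
\end{aligned} \tag{1.29}$$

We now examine the sum $A^1_1 = \epsilon^{1jk}a^2_ja^3_k$. The only terms which contribute to this
sum are the terms for which $j$, $k$ take on the values 2, 3, not necessarily respectively,
for if $j$ or $k$ is set equal to 1, the value of $\epsilon^{1jk}$ is zero. We see immediately that

$$A^1_1 = \epsilon^{1jk}a^2_ja^3_k = \begin{vmatrix} a^2_2 & a^2_3\\ a^3_2 & a^3_3 \end{vmatrix}$$

We call $A^1_1$ the cofactor of the element $a^1_1$. It is the determinant obtained from the
elements of $\|a^i_j\|$ by removing the row and column containing the element $a^1_1$. Similarly

$$A^2_1 = \epsilon^{2jk}a^2_ja^3_k = -\begin{vmatrix} a^2_1 & a^2_3\\ a^3_1 & a^3_3 \end{vmatrix}
\qquad
A^3_1 = \epsilon^{3jk}a^2_ja^3_k = (-1)^{1+3}\begin{vmatrix} a^2_1 & a^2_2\\ a^3_1 & a^3_2 \end{vmatrix}$$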

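The determinant obtained from the elements of $\|a^i_j\|$ by removing the row and column
containing the element $a^i_j$ is called the minor, $M^j_i$, of this element. The cofactor $A^j_i$
of the element $a^i_j$ is such that $A^j_i = (-1)^{i+j}M^j_i$ (no summation). We see that $A = a^1_\alpha A^\alpha_1$. Similarly
$A = a^2_\alpha A^\alpha_2$, $A = a^3_\alpha A^\alpha_3$. We note that $A^j_i$ is the cofactor of the element $a^i_j$.

Let us now look at the sum $a^2_\alpha A^\alpha_1$. Here we have

$$\begin{aligned}
a^2_\alpha A^\alpha_1 &= a^2_1(\epsilon^{1jk}a^2_ja^3_k) + a^2_2(\epsilon^{2jk}a^2_ja^3_k) + a^2_3(\epsilon^{3jk}a^2_ja^3_k)\\
&= \epsilon^{ijk}a^2_ia^2_ja^3_k = 0
\end{aligned}$$

since two rows of this determinant are the same.
In general, $a^i_\alpha A^\alpha_j = 0$ if $i \neq j$. Using this result along with (1.29) yields

$$\begin{aligned}
a^i_\alpha A^\alpha_j &= |a^i_j|\,\delta^i_j\\
a^\alpha_iA^j_\alpha &= |a^i_j|\,\delta^j_i
\end{aligned} \tag{1.30}$$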
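The first identity of (1.30) can be verified numerically. In the sketch below the names `minor`, `det`, `cofactor`, and `expansion` are assumptions for illustration; `cofactor(a, r, c)` returns the signed minor belonging to the element in row r, column c (counting from 0), so that `expansion(a, i, j)` is the sum of the elements of row i multiplied by the cofactors of row j.

```python
def minor(a, r, c):
    """Array left after deleting row r and column c (0-based)."""
    return [[a[i][j] for j in range(len(a)) if j != c]
            for i in range(len(a)) if i != r]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** c * a[0][c] * det(minor(a, 0, c)) for c in range(len(a)))


def cofactor(a, r, c):
    """Signed minor of the element in row r, column c."""
    return (-1) ** (r + c) * det(minor(a, r, c))


def expansion(a, i, j):
    """Elements of row i times the cofactors of the elements of row j."""
    return sum(a[i][k] * cofactor(a, j, k) for k in range(len(a)))


a = [[2, 0, -1], [-1, 2, 0], [3, 1, 1]]
table = [[expansion(a, i, j) for j in range(3)] for i in range(3)]
print(det(a), table)
```

The table has the value of the determinant on the diagonal and zeros elsewhere, which is the content of (1.30).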


Example 1.9


;2 1 a s 
io 4 -2 =2}' el - (Hy 31 +3)% e) owgg 
ee | ‘i a 1 2 as 





Example 1.10. We can now solve the system of equations $a^i_\alpha x^\alpha = b^i$, $i = 1, 2,
\ldots, n$, for the unknowns $x^1, x^2, \ldots, x^n$ provided $|a^i_j| \neq 0$. Multiplying $a^i_\alpha x^\alpha = b^i$
by $A^\beta_i$ and summing on the index $i$ yields $a^i_\alpha A^\beta_ix^\alpha = b^iA^\beta_i$, so that $|a^i_j|\,\delta^\beta_\alpha x^\alpha = b^iA^\beta_i$ from
(1.30). This in turn implies $x^\beta = b^iA^\beta_i/|a^i_j|$, $\beta = 1, 2, \ldots, n$, if $|a^i_j| \neq 0$. The sum
$b^iA^\beta_i$ is also a determinant. It is that determinant obtained by replacing the elements
of the $\beta$th column of $|a^i_j|$ by the elements $b^i$, $i = 1, 2, \ldots, n$. This is Cramer's rule.
For the system of linear equations 


oe — Sy + ¢ = 10 


1 i 
r—-yoae=0 





we have 
‘4 —3 1 | 2 #10 1 ‘2-3 10 | 
4 l I | aa | L! l i 4 
ee me ee re 
; | 2 -3 i1|°  —8 —§ —f 
Pi i ft 
1 1 —1, 
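Cramer's rule as stated in Example 1.10 can be sketched as follows. The function names and the three-equation system used here are illustrative assumptions (the system is not the one printed in the example); exact fractions avoid rounding.

```python
from fractions import Fraction


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for c in range(len(a)):
        sub = [[row[j] for j in range(len(a)) if j != c] for row in a[1:]]
        total += (-1) ** c * a[0][c] * det(sub)
    return total


def cramer(a, b):
    """Solve a x = b by replacing each column of a in turn with b."""
    d = det(a)
    if d == 0:
        raise ValueError("determinant of the coefficients is zero")
    n = len(a)
    solution = []
    for col in range(n):
        replaced = [[b[r] if c == col else a[r][c] for c in range(n)]
                    for r in range(n)]
        solution.append(Fraction(det(replaced), d))
    return solution


# Illustrative system: x + y + z = 6, 2x - y + z = 3, x + 2y - z = 2.
coefficients = [[1, 1, 1], [2, -1, 1], [1, 2, -1]]
right_side = [6, 3, 2]
print(cramer(coefficients, right_side))
```

The rule requires n + 1 determinants of order n, so for large systems the elimination methods of Sec. 1.4 are usually preferable in practice.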
Problems 
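1. What is the cofactor of each element of $\begin{vmatrix} a^1_1 & a^1_2 & a^1_3\\ a^2_1 & a^2_2 & a^2_3\\ a^3_1 & a^3_2 & a^3_3 \end{vmatrix}$?
2. From $a^i_\alpha A^\alpha_j = |a^i_j|\,\delta^i_j$ show that $|A^i_j| = |a^i_j|^{n-1}$.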
o. If gi; = gap wae ay (see Example 1.4), show that |g] = et 








4. If $a^i_j = a^i_j(x)$, show that $\dfrac{d|a^i_j|}{dx} = \dfrac{da^i_j}{dx}A^j_i$ (see Example 1.6).


6. Evaluate 


| - <i 4 7 
| 1 yo —-2 a 4 

—} 1 0 Y 3 
; 0 i) S awd. § 
| Z } —! 0 0 




6. If Ai = Be ee "show that {A‘| = [Bi 
7. Prove that

$$\begin{vmatrix} 1 & 1 & 1\\ a & b & c\\ a^2 & b^2 & c^2 \end{vmatrix} = (a - b)(b - c)(c - a)$$

Hint: The equation $\begin{vmatrix} 1 & 1 & 1\\ x & b & c\\ x^2 & b^2 & c^2 \end{vmatrix} = 0$ is a quadratic which vanishes for $x = b$ and
$x = c$.
8. Show that 
t =2| k —& & 
(i Oo oF 


Tixtend this result so that two determinants of different orders may be multiplied. 
| ay" 


a! 
9. Show that. ie 





ayy | dx aye 
dat | Oy! ark ax 
10. What difficulty does one encounter in the case of a determinant of infinite
order?
la at? +h an | 
11. Consider F(A) = | . hh? S h be =|.) 6 Bhow that FiO) = F'(O) = 0, 
| a he eth 


1.3. Matrices. A matrix is defined as a rectangular array of elements
(usually the elements are real or complex numbers). An algebra of
matrices is developed by defining addition of matrices, multiplication
of matrices, multiplication of a matrix by a scalar (real or complex
number), differentiation of matrices, etc. The definitions chosen for
the above-mentioned operations will be such as to make the calculus of
matrices highly applicable. A matrix $A$ may be denoted as follows:

$$A = \begin{Vmatrix}
a^1_1 & a^1_2 & \cdots & a^1_n\\
a^2_1 & a^2_2 & \cdots & a^2_n\\
\vdots & \vdots & & \vdots\\
a^m_1 & a^m_2 & \cdots & a^m_n
\end{Vmatrix} = \|a^i_j\| \qquad i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n \tag{1.31}$$

If $m = n$, we say that $A$ is a square matrix of order $n$. If $B$ is the
matrix of elements $b^i_j$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, then $B$ is
said to be equal to $A$, written $B = A$ or $A = B$, if and only if $a^i_j = b^i_j$ for
the complete range of values of $i$ and $j$.


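Example 1.11

$$\begin{Vmatrix} a^1_1 & a^1_2 & a^1_3\\ a^2_1 & a^2_2 & a^2_3 \end{Vmatrix} = \begin{Vmatrix} 2 & 0 & -3\\ -3 & 2 & 6 \end{Vmatrix}$$

implies $a^1_1 = 2$, $a^1_2 = 0$, $a^1_3 = -3$, $a^2_1 = -3$, $a^2_2 = 2$, $a^2_3 = 6$.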


Two matrices can be compared for equality if and only if they are
comparable in the sense that they have the same number of rows and the
same number of columns.

The sum of two comparable matrices $A$, $B$ is defined as a new matrix $C$
whose elements $c^i_j$ are obtained by adding the corresponding elements of
$A$ and $B$. Thus

$$\|c^i_j\| = C = A + B = \|a^i_j\| + \|b^i_j\| = \|a^i_j + b^i_j\| \tag{1.32}$$

We note that $A + B = B + A$.


Bzample 1.12 
|3 2 = 15 0 t=} 2 | 
| i 


‘O-4 = 9h TT oe -ib lh 4 3 
3-2 0 Oo} | 3 "| 
0 S/+}0 G)=]| Oo 5 
-1 4 0 0 -1 4| 














We call $A$ a zero matrix if and only if each element of $A$ is equal to the
real number zero.

The product of a matrix $A$ by a number $k$ (real or complex) is defined
as the matrix whose elements are each $k$ times those of $A$, that is,

$$kA = k\|a^i_j\| = \|ka^i_j\|$$





Brample 1.13 


413 8 -2 -|"5 0 3 


| 9 —4 34 § -16 12 
3 —2 4) [4 -1 elds —1 | 
L123 BESO. £ Bl Hr 2 2 
Every matrix $A$ can be associated with a negative matrix $B = -A$ such
that $A + (-A) = (-A) + A = 0$ (zero matrix).

The rule for multiplying a matrix $A$ by a scalar $k$ should not be confused
with the rule for multiplying a determinant by $k$, for in this latter
case the elements of only one row or only one column are multiplied by $k$.

Before defining the product of two matrices let us consider the following
sets of linear transformations:

$$\begin{aligned}
A:&\quad z^i = a^i_\alpha y^\alpha\\
B:&\quad y^\alpha = b^\alpha_jx^j
\end{aligned}$$

Since the $z$'s depend on the $y$'s, which in turn depend on the $x$'s, we
can solve for the $z$'s in terms of the $x$'s. We write this transformation
as follows:

$$AB:\quad z^i = a^i_\alpha b^\alpha_jx^j = c^i_jx^j \qquad c^i_j = a^i_\alpha b^\alpha_j$$


This suggests a method for defining multiplication of the matrices A, B, 


If $A = \|a^i_j\|$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, $B = \|b^i_j\|$, $i = 1, 2,
\ldots, n$, $j = 1, 2, \ldots, p$, then $AB$ is defined as the matrix $C$ such that

$$C = AB = \|a^i_j\| \cdot \|b^i_j\| = \|a^i_\alpha b^\alpha_j\| = \|c^i_j\| \tag{1.33}$$

Let us note that the number of columns of the matrix $A$ must equal the
number of rows of $B$. The matrix $C$ of (1.33) is an $m \times p$ matrix. In
the case of square matrices the definition for multiplication of matrices
corresponds to that for multiplication of determinants [see (1.27)].
This implies that $|C| = |A| \cdot |B|$, where $|C|$ denotes the determinant of
the set of elements comprising the square matrix $C = AB$.


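Example 1.14

$$\begin{aligned}
\begin{Vmatrix} 2 & -1 & 0\\ -2 & 3 & -1\\ 4 & -2 & 3 \end{Vmatrix} \cdot \begin{Vmatrix} 3 & 2\\ 0 & -1\\ 1 & 0 \end{Vmatrix}
&= \begin{Vmatrix}
2(3) + (-1)(0) + 0(1) & 2(2) + (-1)(-1) + 0(0)\\
-2(3) + 3(0) + (-1)(1) & -2(2) + 3(-1) + (-1)(0)\\
4(3) + (-2)(0) + 3(1) & 4(2) + (-2)(-1) + 3(0)
\end{Vmatrix}\\[4pt]
&= \begin{Vmatrix} 6 & 5\\ -7 & -7\\ 15 & 10 \end{Vmatrix}
\end{aligned}$$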






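Example 1.15. The product of two matrices $A$, $B$ can yield the zero matrix with
neither $A$ nor $B$ the zero matrix.

$$\begin{Vmatrix} 0 & 1\\ 0 & 0 \end{Vmatrix} \cdot \begin{Vmatrix} 1 & 0\\ 0 & 0 \end{Vmatrix} = \begin{Vmatrix} 0 & 0\\ 0 & 0 \end{Vmatrix}$$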

Example 1.16. The commutative law does not hold, in general, for multiplication 
of matrices. 
ly 1 le —7 
A+B =|; a} | of ia =O 
1 —] 1 |] L 1 
tie “|; at |e 0 | =|; ‘| 


Example 1.17. If $A$ is a matrix whose elements are functions of a variable $x$, we
define the derivative of $A$ with respect to $x$, written $\dfrac{dA}{dx}$, as follows:

$$\frac{dA}{dx} = \left\|\frac{da^i_j}{dx}\right\| \tag{1.34}$$
2 a 12) 
Thus if A = | ain +}, then —— = | eee I 
Ine | = 
| £ 
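This definition follows logically from the following considerations: If $A(x) = \begin{Vmatrix} f(x)\\ g(x) \end{Vmatrix}$,
then $A(x + \Delta x) = \begin{Vmatrix} f(x + \Delta x)\\ g(x + \Delta x) \end{Vmatrix}$ and

$$\frac{A(x + \Delta x) - A(x)}{\Delta x} = \begin{Vmatrix} \dfrac{f(x + \Delta x) - f(x)}{\Delta x}\\[8pt] \dfrac{g(x + \Delta x) - g(x)}{\Delta x} \end{Vmatrix}$$

so that

$$\frac{dA}{dx} \overset{\text{def}}{=} \begin{Vmatrix} \lim\limits_{\Delta x \to 0}\dfrac{f(x + \Delta x) - f(x)}{\Delta x}\\[8pt] \lim\limits_{\Delta x \to 0}\dfrac{g(x + \Delta x) - g(x)}{\Delta x} \end{Vmatrix} = \begin{Vmatrix} f'(x)\\ g'(x) \end{Vmatrix}$$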


Example 1.18. The transpose of a matrix $A$ is the matrix, written $A^T$, obtained


from A by interchanging the rows and columns of A. If A = i : . then 
|] 
AT=7 2 I} 
=—3 @ 








A square matrix $A$ is said to be a symmetric matrix if and only if
$A = A^T$. If $A = -A^T$, we say that $A$ is a skew-symmetric matrix. We
now exhibit a symmetric matrix $A$ and a skew-symmetric matrix $B$.


a ; 
a sO | Oo -t 8 
A= AT = 3) Uf) OB BT={ 1 0 -2 
<i & -3 2 0 





We let the reader verify that $\tfrac12(A + A^T)$ is a symmetric matrix if $A$
is a square matrix. Let the reader first prove that $(A^T)^T = A$,

$$(A + B)^T = A^T + B^T$$

It is easily seen that $\tfrac12(A - A^T)$ is a skew-symmetric matrix. Any square
matrix $A$ can obviously be written as $A = \tfrac12(A + A^T) + \tfrac12(A - A^T)$.
Hence every square matrix can be written as the sum of a symmetric
and a skew-symmetric matrix.
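The decomposition just described can be carried out mechanically. The function names and the sample matrix below are assumptions for illustration; fractions are used so that the halves are exact.

```python
from fractions import Fraction


def transpose(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]


def symmetric_part(a):
    """(A + A^T) / 2."""
    t = transpose(a)
    n = len(a)
    return [[Fraction(a[i][j] + t[i][j], 2) for j in range(n)] for i in range(n)]


def skew_part(a):
    """(A - A^T) / 2."""
    t = transpose(a)
    n = len(a)
    return [[Fraction(a[i][j] - t[i][j], 2) for j in range(n)] for i in range(n)]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
S = symmetric_part(A)
K = skew_part(A)
total = [[S[i][j] + K[i][j] for j in range(3)] for i in range(3)]
print(total == A, S == transpose(S))
```

The skew part necessarily has zeros down its principal diagonal, since each diagonal element must equal its own negative.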

Example 1.19. We now prove that $(AB)^T = B^TA^T$. Let $A = \|a^i_j\|$, $B = \|b^i_j\|$,
$A^T = \|c^i_j = a^j_i\|$, $B^T = \|d^i_j = b^j_i\|$. Then $B^TA^T = \|d^i_\alpha c^\alpha_j\| = \|b^\alpha_ia^j_\alpha\|$, $i$ = row, $j$ = column,
and $AB = \|a^i_\alpha b^\alpha_j\|$ so that $(AB)^T = \|a^j_\alpha b^\alpha_i\| = B^TA^T$. Q.E.D.

Let the reader now show that $(ABC)^T = C^TB^TA^T$.

A square matrix with 1's down the principal diagonal and zeros elsewhere
is called the identity matrix, written $E = \|\delta^i_j\|$ [see (1.28)]. We
easily verify that

$$AE = \|a^i_j\| \cdot \|\delta^i_j\| = \|a^i_\alpha\delta^\alpha_j\| = \|a^i_j\| = A = EA$$


Problems 


1. Show that $A + B = B + A$. Commutative law of addition.
2. Show that $A + (B + C) = (A + B) + C$. Associative law of addition.
3. Show that $A(B + C) = AB + AC$, $(A + B)C = AC + BC$. Distributive law
of multiplication with respect to addition.
4. Show that $(AB)C = A(BC)$. Associative law of multiplication.
5. From the distributive law show that $0A = 0$ if $0 + B = B + 0 = B$ for all $B$.

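6. Show that the system of equations $y^i = a^i_\alpha x^\alpha$, $i = 1, 2, \ldots, m$, $\alpha = 1, 2, \ldots,
n$, may be written in matrix form as $Y = AX$,

$$\text{where}\quad Y = \begin{Vmatrix} y^1\\ y^2\\ \vdots\\ y^m \end{Vmatrix} \qquad X = \begin{Vmatrix} x^1\\ x^2\\ \vdots\\ x^n \end{Vmatrix} \qquad A = \begin{Vmatrix} a^1_1 & a^1_2 & \cdots & a^1_n\\ a^2_1 & a^2_2 & \cdots & a^2_n\\ \vdots & \vdots & & \vdots\\ a^m_1 & a^m_2 & \cdots & a^m_n \end{Vmatrix}$$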
” pe 
a , dx ere , 
7. Write the ayatem of differential equations yoo a tT = I, Bye atc Hy dR 
riatrix form, 
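8. Show that

$$\begin{Vmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{Vmatrix} \cdot \begin{Vmatrix} a^1_1 & a^1_2 & a^1_3\\ a^2_1 & a^2_2 & a^2_3\\ a^3_1 & a^3_2 & a^3_3 \end{Vmatrix} = \begin{Vmatrix} a^2_1 & a^2_2 & a^2_3\\ a^1_1 & a^1_2 & a^1_3\\ a^3_1 & a^3_2 & a^3_3 \end{Vmatrix}$$

Find the square matrix $E_{rs}$ such that $E_{rs} \cdot A$ interchanges the $r$th and $s$th rows of the
square matrix $A$.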





a 4 - 3 as 
9. lf A = al i,¢ 21,3, ...3n, 3 = =) show that A- B = E. 
10. If $Z = AY$, $Y = BX$, show that $Z = (AB)X$.
11. If $X = \begin{Vmatrix} x^1\\ x^2\\ x^3 \end{Vmatrix}$, show that $X^TX = \|(x^1)^2 + (x^2)^2 + (x^3)^2\|$.
12. From the associative law show that if $AB = E$, then $BA = E$. Hint:
$(BA)B = B(AB)$, and assume $XB = B$ implies $X = E$.


13. If _ = AX, X = BY, B = |[b} = constants||, show that po 





Tf ae ABY. If 
B-lis a matrix (inverse of B such that B-'B = B), show that i {(B-'AB)Y. 
i " 
ye | As Cros 
. i dry" 
iy = ;: B IAB = » show that, ae Ly’ (no sum on 1), 
. | | ZOOS | 
yu” | | An | 
Se aR poe ee: 
14. Show that $A = B^TB$ is a symmetric matrix. Then show that $B^{-1} = A^{-1}B^T$ if
$|B| \neq 0$ (see Prob. 13 for the definition of $B^{-1}$).

15. The trace, or spur, of a square matrix is defined as the sum of the terms along the
principal diagonal, that is, trace $A = a^\alpha_\alpha$. Show that trace $(AB)$ = trace $(BA)$.
Show that trace $(TAS)$ = trace $A$ if $ST = E$. Show that in general (trace $A$)(trace $B$)
$\neq$ trace $(AB)$.


The Inverse of a Matrix. A square matrix $A$ is said to have an inverse
matrix, written $A^{-1}$, if $AA^{-1} = A^{-1}A = E$. Of necessity

$$|A| \cdot |A^{-1}| = |E| = 1$$

so that $|A| \neq 0$. Now let $A$ be a nonsingular square matrix, $|A| \neq 0$.
From (1.30) we have

$$a^i_\alpha\frac{A^\alpha_j}{|A|} = \frac{A^i_\alpha}{|A|}a^\alpha_j = \delta^i_j \tag{1.35}$$

so that the matrix $A^{-1}$, with elements $A^i_j/|A|$, has the property that

$$AA^{-1} = A^{-1}A = E$$

[see (1.33) for the definition of multiplication of two matrices]. Recall
that $A^j_i$ is the cofactor of $a^i_j$.
The inverse of a matrix is unique, for if $BA = E$, then

$$BAA^{-1} = EA^{-1} = A^{-1}$$

which implies $BE = B = A^{-1}$.
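The construction (1.35) of the inverse from cofactors can be sketched as below. The function names and the sample matrix are assumptions for illustration. Note the interchange of indices: the element in row i, column j of the inverse is the cofactor of the element in row j, column i of A, divided by |A|.

```python
from fractions import Fraction


def minor(a, r, c):
    """Array left after deleting row r and column c (0-based)."""
    return [[a[i][j] for j in range(len(a)) if j != c]
            for i in range(len(a)) if i != r]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** c * a[0][c] * det(minor(a, 0, c)) for c in range(len(a)))


def inverse(a):
    """Inverse built from cofactors divided by the determinant."""
    d = det(a)
    if d == 0:
        raise ValueError("a singular matrix has no inverse")
    n = len(a)
    return [[Fraction((-1) ** (i + j) * det(minor(a, j, i)), d)
             for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Matrix product of two square arrays of the same order."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[2, 0, -1], [-1, 2, 0], [3, 1, 1]]
A_inv = inverse(A)
print(matmul(A, A_inv))
```

Both products, A times its inverse and the inverse times A, reduce to the identity matrix E.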











io ro 2 
Ezample 1.20, LetA=}—-1 0 2] |A| = —3, Aj = = 2 
i110 pee 
rg. 1 ao 1 }—1. 
fe, | — — = 2 es mii. 
a Ty | br Ng | a oi 1 4 . 
ek tp i. | © et _{-1t e] | 
AD= |, ‘| 1 AB = | 3. ie! io 1 
gees cal E os eae! - 
mes Ve 2 " Pe os go 8 
py I 
a " 
mem 5g 
| 1 | 
| 3 § 0 
$$AA^{-1} = A^{-1}A = E$$


Example 1.21. Let us consider the set of all nonsingular matrices ($|A| \neq 0$) of
order 3. This set $S$ satisfies the following properties relative to the operation of
multiplication:

$M_1$: $AB = C$, $|C| \neq 0$, and of third order if $A$ and $B$ are nonsingular of order 3. This
is the closure property.

$M_2$: $A(BC) = (AB)C$. The associative law.

$M_3$: $AE = EA = A$ for all $A$, where $E = \begin{Vmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{Vmatrix}$. We call $E$ the unit element
with respect to multiplication. $E$ is unique.

$M_4$: For every matrix $A$ of $S$ there exists a unique matrix $A^{-1}$ such that

$$AA^{-1} = A^{-1}A = E$$

Any set $S$ of elements which obey properties $M_1$ to $M_4$ relative to some operation
(call it multiplication if you prefer) is called a group. Since $AB \neq BA$ for some $A$ and
$B$, the above group is non-Abelian.

If we consider the second operation, in this case addition, we also have

$A_1$: $A + B = B + A = C$

$A_2$: $A + (B + C) = (A + B) + C$

$A_3$: $A + 0 = 0 + A = A$, $\quad 0 = \begin{Vmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{Vmatrix}$

$A_4$: $A + (-A) = (-A) + A = 0$

$D_1$: $A(B + C) = AB + AC$

$D_2$: $(B + C)A = BA + CA$

Let the reader prove that $D_1$, $D_2$, $A_3$ imply $A \cdot 0 = 0 \cdot A = 0$.

Although division has not been defined and will not be defined, quite a few rules of
the real-number system hold for matrices. If $AB = AC$, $|A| \neq 0$, then

$$A^{-1}AB = A^{-1}AC$$

which implies $B = C$. This is the law of cancellation. If $Y = AX$, $|A| \neq 0$ (see
Prob. 6, Sec. 1.3), then $X = A^{-1}Y$. Let the reader prove this useful fact.


From the equation

$$ABB^{-1}A^{-1} = E = (AB)(AB)^{-1} \tag{1.36}$$

we have $(AB)^{-1} = B^{-1}A^{-1}$. The inverse of a product of matrices is the
product of the inverses in reverse order. Let the reader show that
$(A^{-1})^T = (A^T)^{-1}$.


Problems 
1. From the equation $AA^{-1} = E$, show that $A = A^T$ implies $A^{-1} = (A^{-1})^T$.
2. Let $A$ be a skew-symmetric matrix of odd order, $A = -A^T$. Show that $|A| = 0$.
3. Show that $(A^{-1})^{-1} = A$.
4. Show that (AUBA}* = A7'BtA, 
5. Show that $A^{-1}B^{-1}A$ is the inverse of $A^{-1}BA$.
6. Show that $(A^{-1}BA)(A^{-1}CA) = A^{-1}(BC)A$.
7. Find the inverse matrix of the matrix

$$\begin{Vmatrix} 1 & 2 & 3 & 4\\ -1 & 1 & 2 & 3\\ -2 & 4 & 1 & 2\\ 3 & 1 & -1 & 2 \end{Vmatrix}$$

1.4. Linear Equations. Linear Dependence. We first consider the
single linear homogeneous equation

$$a_1x^1 + a_2x^2 + \cdots + a_nx^n = 0 \qquad n > 1;\ a_1 \neq 0 \tag{1.37}$$

An obvious solution of (1.37) is $x^1 = x^2 = \cdots = x^n = 0$, called the
trivial solution. We may obtain nontrivial solutions by choosing any
values we please for $x^2, x^3, \ldots, x^n$, which in turn uniquely determine
$x^1$. Moreover, if $x^1 = c^1$, $x^2 = c^2$, $\ldots$, $x^n = c^n$ is a solution of (1.37),
then $x^1 = \lambda c^1$, $x^2 = \lambda c^2$, $\ldots$, $x^n = \lambda c^n$, $-\infty < \lambda < +\infty$, is also a
solution of (1.37). The equation $x - 2y = 0$ has the solution $x = 2\lambda$,
$y = \lambda$, $-\infty < \lambda < +\infty$. The equation $2x - 3y + 4z = 0$ has the solution
$x = u$, $y = v$, $z = \tfrac14(3v - 2u)$, $-\infty < u < +\infty$, $-\infty < v < +\infty$.
Let us now consider the following system of homogeneous equations:

$$\begin{aligned}
x + 2y - 4z &= 0\\
2x - 3y + z &= 0
\end{aligned} \tag{1.38}$$

It is obvious that $x = y = z = 0$ is a trivial solution of (1.38). We look
for nontrivial solutions. From (1.38) we obtain $7y - 9z = 0$ by eliminating
$x$, so that $z = 7\lambda$, $y = 9\lambda$, and thus $x = 10\lambda$, $-\infty < \lambda < +\infty$,
is a solution of (1.38).

In dealing with a system of linear homogeneous equations we can
multiply any equation by a nonzero scalar without changing the solution of the system. Moreover,
we may also multiply any equation by a scalar and add the resulting
equation to any other equation without changing the solution of the
system. In these operations only the coefficients of the unknowns play
a role. Hence one needs only to manipulate the elements of the coefficient
matrix. For (1.38) we have $\begin{Vmatrix} 1 & 2 & -4\\ 2 & -3 & 1 \end{Vmatrix}$, which transforms into
$\begin{Vmatrix} 1 & 2 & -4\\ 0 & -7 & 9 \end{Vmatrix}$ by multiplying the first row by $-2$ and adding the corresponding
elements to the second row. We then solve $-7y + 9z = 0$
for $y$, $z$ and then solve for $x$ in the equation $x + 2y - 4z = 0$.


Example 1.22. Let us look at a system of four linear homogeneous equations in five unknowns.
$$\begin{aligned} x - y + 2z + 3u + 3v &= 0 \\ 2x + y - z + u + 4v &= 0 \\ -3x + 2y + 4z + u &= 0 \\ 4y + 4z + 2v &= 0 \end{aligned} \tag{1.39}$$
We now triangularize the elements of the coefficient matrix
$$\begin{Vmatrix} 1 & -1 & 2 & 3 & 3 \\ 2 & 1 & -1 & 1 & 4 \\ -3 & 2 & 4 & 1 & 0 \\ 0 & 4 & 4 & 0 & 2 \end{Vmatrix} \tag{1.40}$$
Multiplying row one by $-2$ and adding to row two, and multiplying row one by 3 and adding to row three, yields
$$\begin{Vmatrix} 1 & -1 & 2 & 3 & 3 \\ 0 & 3 & -5 & -5 & -2 \\ 0 & -1 & 10 & 10 & 9 \\ 0 & 4 & 4 & 0 & 2 \end{Vmatrix} \tag{1.41}$$


Continuing, (1.41) transforms into
$$\begin{Vmatrix} 1 & -1 & 2 & 3 & 3 \\ 0 & -1 & 10 & 10 & 9 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & -4 & -6 \end{Vmatrix} \tag{1.42}$$
At one stage we interchanged rows two and three.
The equivalent system of equations is
$$\begin{aligned} -4u - 6v &= 0 \\ z + u + v &= 0 \\ -y + 10z + 10u + 9v &= 0 \\ x - y + 2z + 3u + 3v &= 0 \end{aligned} \tag{1.43}$$
Letting $v = -2$ yields in turn $u = 3$, $z = -1$, $y = 2$, $x = 1$. The most general solution of (1.39) is $x = \lambda$, $y = 2\lambda$, $z = -\lambda$, $u = 3\lambda$, $v = -2\lambda$, $-\infty < \lambda < +\infty$.
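The elimination carried out in Example 1.22 can be mechanized. The following Python sketch is an illustration and not part of the original text: the function name `nontrivial_solution` and the use of the standard `fractions` module for exact arithmetic are assumptions made here. It reduces the coefficient matrix completely (rather than stopping at triangular form), sets the first free unknown equal to 1, and reads off the remaining unknowns.

```python
from fractions import Fraction

def nontrivial_solution(rows):
    """Row-reduce a homogeneous system; return one nontrivial solution.

    `rows` is the coefficient matrix as a list of rows. Returns None
    when only the trivial solution exists.
    """
    a = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(a), len(a[0])
    pivots = []   # pivot column of each reduced row, in order
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n_rows) if a[i][c] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]               # interchange two rows
        a[r] = [v / a[r][c] for v in a[r]]    # scale the pivot to 1
        for i in range(n_rows):
            if i != r and a[i][c] != 0:
                f = a[i][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    free = [c for c in range(n_cols) if c not in pivots]
    if not free:
        return None
    # First free unknown = 1, other free unknowns = 0, then back-substitute.
    x = [Fraction(0)] * n_cols
    x[free[0]] = Fraction(1)
    for row, c in zip(a, pivots):
        x[c] = -sum(row[j] * x[j] for j in free)
    return x

# Coefficient matrix (1.40) of system (1.39).
system_139 = [
    [1, -1, 2, 3, 3],
    [2, 1, -1, 1, 4],
    [-3, 2, 4, 1, 0],
    [0, 4, 4, 0, 2],
]
sol = nontrivial_solution(system_139)
# Rescale so that x = 1 (valid here because the x component is nonzero).
sol = [v / sol[0] for v in sol]
print([str(v) for v in sol])
```

Any scalar multiple of the vector returned is again a solution, in agreement with the general solution stated in the example.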


From the above consideration the reader can prove by mathematical induction or otherwise that a system of n linear homogeneous equations in m unknowns always possesses a nontrivial solution if m > n.

We now consider the case m = n. The system
$$a^i_j x^j = 0 \qquad i, j = 1, 2, \ldots, n \tag{1.44}$$
has the unique trivial solution $x^1 = x^2 = \cdots = x^n = 0$ if $|a^i_j| \neq 0$, from Example 1.10. The only possibility for the existence of a nontrivial solution occurs for the case $|a^i_j| = 0$. Triangularizing the matrix $\|a^i_j\|$ yields a new equivalent triangular matrix $\|b^i_j\|$ such that $|b^i_j| = 0$. Let the reader explain this. We thus obtain
$$\begin{vmatrix} b^1_1 & b^1_2 & b^1_3 & \cdots & b^1_n \\ 0 & b^2_2 & b^2_3 & \cdots & b^2_n \\ 0 & 0 & b^3_3 & \cdots & b^3_n \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & b^n_n \end{vmatrix} = 0$$
which implies $b^i_i = 0$ (no summation) for at least one value of i. If $b^n_n = 0$, we have reduced our original system to one containing more unknowns than equations, for which a nontrivial solution exists. If $b^n_n \neq 0$, $b^{n-1}_{n-1} = 0$, then $x^n = 0$ and again we have more unknowns than equations so that a nontrivial solution again exists. Continuing, we see that the vanishing of at least one element along the main diagonal implies the existence of a nontrivial solution.


Example 1.23. The determinant of the coefficient matrix of the system
$$\begin{aligned} x + y + z + u &= 0 \\ 2x - y + z - u &= 0 \\ 5x - y + 3z &= 0 \\ -x + 5y + z + 2u &= 0 \end{aligned} \tag{1.45}$$
vanishes. Triangularizing the coefficient matrix yields
$$\begin{Vmatrix} 1 & 1 & 1 & 1 \\ 0 & -3 & -1 & -3 \\ 0 & 0 & 0 & 7 \\ 0 & 0 & 0 & -3 \end{Vmatrix}$$
System (1.45) becomes $-3u = 7u = 0$ so that $u = 0$, and we have $x + y + z = 0$, $-3y - z = 0$, so that $z = -3\lambda$, $y = \lambda$, $x = 2\lambda$, $-\infty < \lambda < +\infty$.


On considering the system
$$\begin{aligned} x - y &= 0 \\ 2x + y &= 0 \\ x + y &= 0 \end{aligned} \tag{1.46}$$
we see immediately that only the trivial solution $x = y = 0$ exists.

Generally speaking, a system of m linear homogeneous equations in n unknowns, m > n, does not possess a nontrivial solution. The reader is referred to Ferrar's text on algebra for the complete discussion of this case. The rank of a matrix plays an important role in discussing solutions of linear equations. For a discussion of the rank of a matrix see the above-mentioned text.
Example 1.24. The column matrix
$$X = \begin{Vmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{Vmatrix}$$
will be called a vector. The number $x^j$ is called the jth component of X. We call the number n the dimensionality of the space. The determinant of the matrix $X^TX$ is called the square of the magnitude of the vector X. If the $x^i$, $i = 1, 2, \ldots, n$, are complex numbers, we define $|\bar{X}^TX|$ as the square of the magnitude of X, where $\bar{X}^T = \|\bar{x}^1\ \bar{x}^2\ \cdots\ \bar{x}^n\|$, $\bar{x}^i$ = conjugate complex of $x^i$.


The system of vectors
$$X_r = \begin{Vmatrix} x^1_r \\ x^2_r \\ \vdots \\ x^n_r \end{Vmatrix} \qquad r = 1, 2, \ldots, m \tag{1.47}$$
is said to be linearly dependent if there exist scalars $\lambda^1, \lambda^2, \ldots, \lambda^m$, not all zero, such that
$$\lambda^r X_r = 0 \tag{1.48}$$
Equivalently, the set of vectors (1.47) is a linearly independent system if Eq. (1.48) implies $\lambda^1 = \lambda^2 = \cdots = \lambda^m = 0$.


We now prove the following theorem: Any m vectors in an n-dimensional space are linearly dependent if m > n. The proof is as follows: The system of equations $\lambda^r X_r = 0$ is equivalent to the system of n linear homogeneous equations in the m unknowns $\lambda^1, \lambda^2, \ldots, \lambda^m$
$$x^i_r \lambda^r = 0 \qquad i = 1, 2, \ldots, n;\ r = 1, 2, \ldots, m$$
Such a system always has a nontrivial solution for m > n. Q.E.D.


Problems 
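1. Solve the system
$$\begin{aligned} x - 2y + 3z - 4u &= 0 \\ 2x + y - z + u &= 0 \\ 3x - y + 2z - u &= 0 \end{aligned}$$
2. Solve the system
$$\begin{aligned} x + 2y - z - 3u &= 0 \\ 2x - y + z + 4u &= 0 \end{aligned}$$
3. Solve the system
$$\begin{aligned} 2x + y - z + u &= 0 \\ x - y - z + u &= 0 \\ x - 4y - 2z + 2u &= 0 \\ 5x + y - 3z + 3u &= 0 \end{aligned}$$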


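4. Determine $\lambda$ so that the following system will have nontrivial solutions, and solve:
$$\begin{aligned} \lambda x &= 4x + y \\ \lambda y &= -2x + y \end{aligned}$$
5. Show that the vectors
$$X_1 = \begin{Vmatrix} 1 \\ 1 \\ 1 \end{Vmatrix}, \quad X_2 = \begin{Vmatrix} 0 \\ -1 \\ 1 \end{Vmatrix}, \quad X_3 = \begin{Vmatrix} 1 \\ -1 \\ 0 \end{Vmatrix}$$
are linearly independent. Given $X = \begin{Vmatrix} 2 \\ -1 \\ 5 \end{Vmatrix}$, find scalars $\lambda_1$, $\lambda_2$, $\lambda_3$ such that $X = \sum_{i=1}^{3} \lambda_i X_i$.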

1.5. Quadratic Forms. The square of the distance between a point P(x, y, z) and the origin O(0, 0, 0) in a Euclidean space is given by
$$L^2 = x^2 + y^2 + z^2 \tag{1.49}$$
using a Euclidean coordinate system. The generalization of (1.49) to an n-dimensional space yields
$$L^2 = \sum_{i=1}^{n} (x^i)^2 \tag{1.50}$$
The linear transformation $x^i = a^i_\alpha y^\alpha$, $i, \alpha = 1, 2, \ldots, n$, $|a^i_\alpha| \neq 0$, yields, from (1.50),
$$L^2 = \sum_{i=1}^{n} a^i_\alpha a^i_\beta\, y^\alpha y^\beta \tag{1.51}$$


If we desire that the y's be the components of a Euclidean coordinate system, we need
$$L^2 = \sum_{i=1}^{n} (y^i)^2 \tag{1.52}$$
Comparing (1.51), (1.52) yields
$$\sum_{i=1}^{n} a^i_\alpha a^i_\beta = \delta_{\alpha\beta} = \begin{cases} 1 & \text{if } \alpha = \beta \\ 0 & \text{if } \alpha \neq \beta \end{cases} \tag{1.53}$$
The system of Eqs. (1.53) in matrix form becomes
$$A^TA = AA^T = E \qquad A = \|a^i_j\| \tag{1.54}$$
Equation (1.54) implies in turn that $A^T = A^{-1}$. A matrix A satisfying (1.54) is called an orthogonal matrix. If $A_1$, $A_2$ are any two different column vectors of A, then $A_1^TA_2 = 0$. We say that the vectors are perpendicular.


For complex components (1.50) becomes $L^2 = \sum_{i=1}^{n} \bar{x}^i x^i$, and for the y's to be the components of an orthogonal coordinate system we find that the matrix A must satisfy $\bar{A}^TA = E$ or $\bar{A}^T = A^{-1}$, where $\bar{A} = \|\bar{a}^i_j\|$, $\bar{a}^i_j$ the complex conjugate of $a^i_j$. If $\bar{A}^T = A^{-1}$, we say that A is a unitary matrix.
We now consider the quadratic form
$$Q = a_{\alpha\beta}x^\alpha x^\beta \qquad a_{\alpha\beta}\ \text{real};\ \alpha, \beta = 1, 2, \ldots, n \tag{1.55}$$
and ask whether it is possible to find an orthogonal transformation X = BY such that (1.55) reduces to the canonical form
$$Q = \sum_{i=1}^{n} \lambda_i (y^i)^2 \tag{1.56}$$
Let the student note that (1.55) may be written in matrix form as
$$Q = X^TAX \tag{1.57}$$
noting that $X^TAX$ is a matrix of just one element and so is written as a scalar. Under the transformation X = BY, (1.57) becomes
$$Q = (BY)^TA(BY) = Y^T(B^TAB)Y \tag{1.58}$$


so that (1.58) will have the form of (1.56) if and only if
$$B^TAB = \begin{Vmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & 0 & \cdots & 0 \\ 0 & 0 & \lambda_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_n \end{Vmatrix} \tag{1.59}$$
that is, $B^TAB$ must be a diagonal matrix. Our problem has been reduced to that of finding a matrix B satisfying (1.59), and hence satisfying
$$B^{-1}AB = \|\lambda_i\delta_{ij}\| \quad \text{or} \quad AB = B\|\lambda_i\delta_{ij}\| \tag{1.60}$$
since $B^T = B^{-1}$ is required if X = BY is to be an orthogonal transformation. We may consider A to be a symmetric matrix, $A = A^T$, since $Q = X^T[\frac{1}{2}(A + A^T)]X + X^T[\frac{1}{2}(A - A^T)]X$, and $X^T[\frac{1}{2}(A - A^T)]X = 0$ (see Prob. 9, Sec. 1.1), while $\frac{1}{2}(A + A^T)$ is a symmetric matrix.

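We now attempt to find the square matrix B satisfying (1.60). If $B = \|b_{ij}\|$, Eq. (1.60) becomes
$$\sum_{\alpha=1}^{n} a_{i\alpha}b_{\alpha j} = \sum_{\alpha=1}^{n} b_{i\alpha}\lambda_\alpha\delta_{\alpha j} = b_{ij}\lambda_j \qquad i, j = 1, 2, \ldots, n \tag{1.61}$$
with no summation on j in the last member. For j = 1 we have
$$\sum_{\alpha=1}^{n} a_{i\alpha}b_{\alpha 1} = b_{i1}\lambda_1$$
or
$$A\begin{Vmatrix} b_{11} \\ b_{21} \\ \vdots \\ b_{n1} \end{Vmatrix} = \lambda_1\begin{Vmatrix} b_{11} \\ b_{21} \\ \vdots \\ b_{n1} \end{Vmatrix} \tag{1.62}$$
If $B_1$ is the column matrix (vector) with elements $b_{11}, b_{21}, \ldots, b_{n1}$, (1.62) may be written
$$AB_1 = \lambda_1B_1 \qquad A = A^T \tag{1.63}$$
or $(A - \lambda_1E)B_1 = 0$.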


Equation (1.63) represents a system of n linear homogeneous equations in the n unknowns comprising the column matrix $B_1$. From Sec. 1.4 a necessary and sufficient condition that nontrivial solutions exist is that $|A - \lambda E| = 0$, or
$$\begin{vmatrix} a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda \end{vmatrix} = 0 \tag{1.64}$$
We call (1.64) the characteristic equation of the matrix A. It is a polynomial equation in $\lambda$ of degree n, so has n roots $\lambda_1, \lambda_2, \ldots, \lambda_n$, the $\lambda_i$ real or complex. The roots $\lambda_1, \lambda_2, \ldots, \lambda_n$ determine the column vectors $B_1, B_2, \ldots, B_n$, which in turn comprise the matrix B, that is,
$$B = \|B_1\ B_2\ \cdots\ B_n\| \qquad \text{a square matrix}$$
The solution $B_i$ of $AB_i = \lambda_iB_i$ is called an eigenfunction, eigenvector, or characteristic vector. $\lambda_i$ is called the eigenvalue, or latent root, or characteristic root corresponding to the eigenfunction $B_i$. If $B_i$ is a solution of (1.63), so is $B_i$/(length of $B_i$), a vector of unit length. If the $B_i$, $i = 1, 2, \ldots, n$, are unit vectors, it is easy to prove that the matrix B is an orthogonal matrix. Let
$$AB_1 = \lambda_1B_1 \qquad AB_2 = \lambda_2B_2 \tag{1.65}$$
Then $(AB_1)^T = B_1^TA^T = B_1^TA = \lambda_1B_1^T$ so that $B_1^TAB_2 = \lambda_1B_1^TB_2$. Moreover $B_1^TAB_2 = \lambda_2B_1^TB_2$, so that $(\lambda_1 - \lambda_2)B_1^TB_2 = 0$ and, provided $\lambda_1 \neq \lambda_2$, $B_1^TB_2 = 0$. If the $\lambda_i$, $i = 1, 2, \ldots, n$, are all different, the matrix B is an orthogonal matrix.


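Example 1.25. We now find the linear orthogonal transformation which transforms the quadratic form $Q = 7x^2 + 7y^2 + 7z^2 + 6xy + 8yz$ into canonical form. We have
$$Q = \begin{Vmatrix} x & y & z \end{Vmatrix}\begin{Vmatrix} 7 & 3 & 0 \\ 3 & 7 & 4 \\ 0 & 4 & 7 \end{Vmatrix}\begin{Vmatrix} x \\ y \\ z \end{Vmatrix}$$
The characteristic equation is
$$\begin{vmatrix} 7 - \lambda & 3 & 0 \\ 3 & 7 - \lambda & 4 \\ 0 & 4 & 7 - \lambda \end{vmatrix} = 0$$
which reduces to $(7 - \lambda)(\lambda - 12)(\lambda - 2) = 0$ so that $\lambda_1 = 7$, $\lambda_2 = 12$, $\lambda_3 = 2$. For $\lambda_1 = 7$ Eq. (1.63) becomes $3b_2 = 0$, $3b_1 + 4b_3 = 0$, $4b_2 = 0$. A unit vector $B_1$ whose components $b_1$, $b_2$, $b_3$ satisfy these equations is
$$B_1 = \begin{Vmatrix} \frac{4}{5} \\ 0 \\ -\frac{3}{5} \end{Vmatrix}$$
For $\lambda_2 = 12$ Eq. (1.63) becomes $-5b_1 + 3b_2 = 0$, $3b_1 - 5b_2 + 4b_3 = 0$, $4b_2 - 5b_3 = 0$, so that
$$B_2 = \begin{Vmatrix} \frac{3}{5\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ \frac{4}{5\sqrt{2}} \end{Vmatrix}$$
For $\lambda_3 = 2$ we obtain
$$B_3 = \begin{Vmatrix} -\frac{3}{5\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ -\frac{4}{5\sqrt{2}} \end{Vmatrix}$$
Thus
$$B = \begin{Vmatrix} \frac{4}{5} & \frac{3}{5\sqrt{2}} & -\frac{3}{5\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{3}{5} & \frac{4}{5\sqrt{2}} & -\frac{4}{5\sqrt{2}} \end{Vmatrix} \quad \text{and} \quad \begin{Vmatrix} x \\ y \\ z \end{Vmatrix} = B\begin{Vmatrix} u \\ v \\ w \end{Vmatrix}$$
so that
$$\begin{aligned} x &= \frac{1}{5\sqrt{2}}(4\sqrt{2}\,u + 3v - 3w) \\ y &= \frac{1}{5\sqrt{2}}(5v + 5w) \\ z &= \frac{1}{5\sqrt{2}}(-3\sqrt{2}\,u + 4v - 4w) \end{aligned} \tag{1.66}$$
The quadratic form $Q = 7x^2 + 7y^2 + 7z^2 + 6xy + 8yz$ becomes
$$Q = 7u^2 + 12v^2 + 2w^2$$
under the linear orthogonal transformation (1.66).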
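The result of Example 1.25 can be checked numerically. The short Python sketch below is an added illustration, not part of the original text; the helper names `matmul` and `transpose` are chosen here for convenience. It forms $B^TB$ and $B^TAB$ for the matrices of the example, which should agree with the unit matrix and with the diagonal matrix of the roots 7, 12, 2 up to rounding error.

```python
import math

# Symmetric matrix of the quadratic form 7x^2 + 7y^2 + 7z^2 + 6xy + 8yz.
A = [[7, 3, 0],
     [3, 7, 4],
     [0, 4, 7]]

r = 5 * math.sqrt(2)
# Columns of B are the unit eigenvectors B1, B2, B3 found in the example.
B = [[4 / 5,  3 / r, -3 / r],
     [0.0,    5 / r,  5 / r],
     [-3 / 5, 4 / r, -4 / r]]

def matmul(P, Q):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

BtB = matmul(transpose(B), B)               # expected: unit matrix E
BtAB = matmul(transpose(B), matmul(A, B))   # expected: diagonal, roots 7, 12, 2

for row in BtAB:
    print([round(v, 10) for v in row])
```

Because the check uses floating-point arithmetic, the off-diagonal elements come out as very small numbers rather than exact zeros.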
Each root $\lambda_i$, $i = 1, 2, \ldots, n$, of $|A - \lambda E| = 0$ determined a column vector of the orthogonal matrix B. If multiple roots of $|A - \lambda E| = 0$ occur, it appears at first glance that we cannot complete the full matrix B. However, we can show that an orthogonal matrix F exists such that the quadratic form $Q = X^TAX$, $A = A^T$, is canonical in form for the transformation X = FW.

First let us note the following pertinent facts: If X = BY, Y = CZ are orthogonal transformations, then X = (BC)Z is an orthogonal transformation. We have $(BC)^T = C^TB^T = C^{-1}B^{-1} = (BC)^{-1}$, so that BC is an orthogonal matrix. In other words, the matrix product of orthogonal matrices is an orthogonal matrix. Next we note that, if B is an orthogonal matrix in a k-dimensional space, k < n, then
$$C = \begin{Vmatrix} E_{n-k} & 0 \\ 0 & B \end{Vmatrix}$$
where $E_{n-k}$ is the unit matrix of order n − k and the remaining elements outside B are zeros, is an orthogonal matrix for the n-dimensional space. The reader can easily verify that C has the necessary properties for an orthogonal matrix. Finally the roots of $|A - \lambda E| = 0$ are the roots of $|B^{-1}AB - \lambda E| = 0$, and conversely. From $B^{-1}AB - \lambda E = B^{-1}(A - \lambda E)B$ we have
$$\begin{aligned} |B^{-1}AB - \lambda E| &= |B^{-1}(A - \lambda E)B| \\ &= |B^{-1}|\,|A - \lambda E|\,|B| \\ &= |B^{-1}|\,|B|\,|A - \lambda E| \\ &= |B^{-1}B|\,|A - \lambda E| \\ &= |A - \lambda E| \end{aligned}$$
which proves our statement.

Now let $\lambda_1$ be any root of $|A - \lambda E| = 0$. It is immaterial whether $\lambda_1$ is a multiple root. We solve $(A - \lambda_1E)B_1 = 0$ for $B_1$, $B_1$ a unit vector. We now obtain an orthogonal matrix B with $B_1$ as its first column. This can be done as follows (the method does not yield a unique answer): Let $B_2 = \|b_{i2}\|$ be the elements of the second column of B. In order that $B_2$ be orthogonal to $B_1$, we need $\sum_{i=1}^{n} b_{i1}b_{i2} = 0$. This is a single homogeneous equation in the unknowns $b_{i2}$, $i = 1, 2, \ldots, n$. We know that we can find a nontrivial solution which can be normalized so that $\sum_{i=1}^{n} b_{i2}^2 = 1$. To obtain the third column, we need $\sum_{i=1}^{n} b_{i1}b_{i3} = 0$, $\sum_{i=1}^{n} b_{i2}b_{i3} = 0$. A normalized nontrivial solution exists for n > 2. By continuing this process we construct an orthogonal matrix B. The final column of B involves a system of n − 1 equations in n unknowns for which a nontrivial solution exists. From the construction of B it follows that $B^{-1}AB$ has $\lambda_1$ as the element of the first row and first column and has zeros elsewhere in the first column. From
$$(B^{-1}AB)^T = B^TA^T(B^{-1})^T = B^{-1}A(B^T)^{-1} = B^{-1}A(B^{-1})^{-1} = B^{-1}AB$$
we note that $B^{-1}AB$ is a symmetric matrix. We have used the facts that $A = A^T$, $B^T = B^{-1}$. Hence under the orthogonal transformation X = BY, Q becomes


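$$\begin{aligned} Q = X^TAX &= Y^T(B^{-1}AB)Y \\ &= \begin{Vmatrix} y^1 & y^2 & \cdots & y^n \end{Vmatrix}\begin{Vmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & c_{n2} & \cdots & c_{nn} \end{Vmatrix}\begin{Vmatrix} y^1 \\ y^2 \\ \vdots \\ y^n \end{Vmatrix} \\ &= \lambda_1(y^1)^2 + \sum_{\alpha,\beta=2}^{n} c_{\alpha\beta}y^\alpha y^\beta = \sum_{\alpha,\beta=1}^{n} c_{\alpha\beta}y^\alpha y^\beta \end{aligned}$$
The characteristic roots of the matrix $\|c_{ij}\| = B^{-1}AB$ satisfy
$$\begin{vmatrix} \lambda_1 - \lambda & 0 & 0 & \cdots & 0 \\ 0 & c_{22} - \lambda & c_{23} & \cdots & c_{2n} \\ 0 & c_{32} & c_{33} - \lambda & \cdots & c_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & c_{n2} & c_{n3} & \cdots & c_{nn} - \lambda \end{vmatrix} = 0$$
Hence the remaining roots of $|A - \lambda E| = 0$ satisfy
$$\begin{vmatrix} c_{22} - \lambda & c_{23} & \cdots & c_{2n} \\ c_{32} & c_{33} - \lambda & \cdots & c_{3n} \\ \vdots & \vdots & & \vdots \\ c_{n2} & c_{n3} & \cdots & c_{nn} - \lambda \end{vmatrix} = 0$$
By the same procedure we can reduce $\sum_{\alpha,\beta=2}^{n} c_{\alpha\beta}y^\alpha y^\beta$ to the form
$$\lambda_2(z^2)^2 + \sum_{\alpha,\beta=3}^{n} d_{\alpha\beta}z^\alpha z^\beta$$
so that Q becomes $Q = \lambda_1(z^1)^2 + \lambda_2(z^2)^2 + \sum_{\alpha,\beta=3}^{n} d_{\alpha\beta}z^\alpha z^\beta$, with $\lambda_1 = \lambda_2$ if $\lambda_1$ is a repeated root. Continuing this process reduces Q to canonical form. Q.E.D.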

1.6. Positive-definite Quadratic Forms. Inverse of a Matrix. Let us consider the quadratic form
$$Q = X^T(S^TS)X \tag{1.67}$$
where S is a triangular matrix, $S = \|s^i_j\|$, $s^i_j = 0$ for i > j. From (1.67) we have $Q = (X^TS^T)(SX) = (SX)^T(SX) = \sum_{i=1}^{n} (s^i_\alpha x^\alpha)^2$. It is obvious that Q > 0 unless $s^i_\alpha x^\alpha = 0$, $i = 1, 2, \ldots, n$. If $|s^i_j| \neq 0$, we know that this system of equations has only the trivial solution
$$x^1 = x^2 = \cdots = x^n = 0$$
Thus the special quadratic form (1.67) for $|S| \neq 0$ has the property Q > 0 unless $x^1 = x^2 = \cdots = x^n = 0$. If $Q = X^TAX > 0$ unless X = 0, we say that Q is a positive-definite quadratic form. For such a form it can be shown that
$$a_{11} > 0, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} > 0, \quad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} > 0, \quad \ldots, \quad |a_{ij}| = |A| > 0 \tag{1.68}$$
(see "Algebra," by W. L. Ferrar, Oxford University Press). Conversely, (1.68) implies that Q is positive-definite.
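The above considerations suggest that, if Q is a positive-definite form, $Q = X^TAX$, $A = A^T$, then there exists a triangular matrix S such that $S^TS = A$. We now show that this is true. The equation $S^TS = A$ implies
$$\sum_{\alpha=1}^{n} s_{\alpha i}s_{\alpha j} = a_{ij} \qquad i, j = 1, 2, \ldots, n \tag{1.69}$$
For i = j = 1 we have $s_{11}^2 = a_{11}$ so that $s_{11} = (a_{11})^{1/2}$. For i = 1, j > 1 we obtain $s_{11}s_{1j} = a_{1j}$, since $s_{21} = s_{31} = \cdots = s_{n1} = 0$, so that
$$s_{1j} = \frac{a_{1j}}{s_{11}} = a_{1j}(a_{11})^{-1/2} \qquad j = 2, 3, \ldots, n$$
For i = j = 2 we have $(s_{12})^2 + (s_{22})^2 = a_{22}$ so that
$$s_{22} = (a_{22} - s_{12}^2)^{1/2} = \left(a_{22} - \frac{a_{12}^2}{a_{11}}\right)^{1/2} = \left(\frac{1}{a_{11}}\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}\right)^{1/2}$$
Remember that $a_{ij} = a_{ji}$. For i = 2, j > 2 we have $s_{12}s_{1j} + s_{22}s_{2j} = a_{2j}$, and $s_{2j} = (a_{2j} - s_{12}s_{1j})/s_{22}$. Continuing this process, one obtains the set of equations
$$\begin{aligned} s_{11} &= (a_{11})^{1/2} \\ s_{1j} &= \frac{a_{1j}}{s_{11}} \qquad j > 1 \\ s_{22} &= (a_{22} - s_{12}^2)^{1/2} \\ s_{2j} &= \frac{a_{2j} - s_{12}s_{1j}}{s_{22}} \qquad j > 2 \\ &\ \ \vdots \\ s_{kk} &= [a_{kk} - (s_{1k}^2 + s_{2k}^2 + \cdots + s_{k-1,k}^2)]^{1/2} \\ s_{kj} &= \frac{a_{kj} - (s_{1k}s_{1j} + s_{2k}s_{2j} + \cdots + s_{k-1,k}s_{k-1,j})}{s_{kk}} \qquad j > k \\ &\ \ \vdots \end{aligned} \tag{1.70}$$
Given the numbers $a_{ij}$, the elements $s_{ij}$ can be calculated step by step from (1.70). Let the reader show that
$$s_{33} = \left(\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \Bigg/ \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}\right)^{1/2} \tag{1.71}$$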








The above method for determining S can be used to find the inverse matrix of A provided $X^TAX$ is positive-definite. Since $A = S^TS$, we have $A^{-1} = S^{-1}(S^T)^{-1} = S^{-1}(S^{-1})^T$. To find $S^{-1}$ from S, we proceed as follows: Let $T = S^{-1}$, $t_{ij} = 0$, i > j. From TS = E we obtain
$$\sum_{\alpha=1}^{n} t_{i\alpha}s_{\alpha j} = \delta_{ij} \qquad i, j = 1, 2, \ldots, n \tag{1.72}$$
Equation (1.72) enables one to solve for the $t_{ij}$. Thus $t_{11}s_{11} = 1$, or $t_{11} = 1/s_{11}$. For i = 1, j = 2 we have $t_{11}s_{12} + t_{12}s_{22} = \delta_{12} = 0$ so that $t_{12} = -t_{11}s_{12}/s_{22}$. Continuing, we can compute $t_{13}, \ldots, t_{1n}, t_{22}, t_{23}, \ldots, t_{nn}$.


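Example 1.26. We invert the matrix
$$A = \begin{Vmatrix} 1 & 2 & 1 \\ 2 & 5 & 3 \\ 1 & 3 & 4 \end{Vmatrix}$$
From (1.70) $s_{11} = 1$, $s_{12} = 2$, $s_{13} = 1$, $s_{22} = 1$, $s_{23} = 1$, $s_{33} = \sqrt{2}$. From (1.72) $t_{11} = 1$, $t_{12} = -2$, $t_{13} = -(t_{11}s_{13} + t_{12}s_{23})/s_{33} = 1/\sqrt{2}$, $t_{22} = 1$, $t_{23} = -1/\sqrt{2}$, $t_{33} = 1/\sqrt{2}$. Thus
$$S^{-1} = \begin{Vmatrix} 1 & -2 & \frac{1}{\sqrt{2}} \\ 0 & 1 & -\frac{1}{\sqrt{2}} \\ 0 & 0 & \frac{1}{\sqrt{2}} \end{Vmatrix}$$
and
$$A^{-1} = S^{-1}(S^{-1})^T = \begin{Vmatrix} 1 & -2 & \frac{1}{\sqrt{2}} \\ 0 & 1 & -\frac{1}{\sqrt{2}} \\ 0 & 0 & \frac{1}{\sqrt{2}} \end{Vmatrix}\begin{Vmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{Vmatrix} = \frac{1}{2}\begin{Vmatrix} 11 & -5 & 1 \\ -5 & 3 & -1 \\ 1 & -1 & 1 \end{Vmatrix}$$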
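The step-by-step formulas (1.70) and (1.72) translate directly into a short program. The Python sketch below is an added illustration under stated assumptions: the function names `triangular_factor`, `triangular_inverse`, and `inverse_positive_definite` are chosen here and do not come from the text, and floating-point arithmetic is used, so the results are approximate. It is applied to the matrix A of Example 1.26.

```python
import math

def triangular_factor(a):
    """Upper-triangular S with S^T S = a, computed row by row as in (1.70)."""
    n = len(a)
    s = [[0.0] * n for _ in range(n)]
    for k in range(n):
        diag = a[k][k] - sum(s[m][k] ** 2 for m in range(k))
        if diag <= 0:
            raise ValueError("matrix is not positive-definite")
        s[k][k] = math.sqrt(diag)
        for j in range(k + 1, n):
            cross = sum(s[m][k] * s[m][j] for m in range(k))
            s[k][j] = (a[k][j] - cross) / s[k][k]
    return s

def triangular_inverse(s):
    """Upper-triangular T with T S = E, solved from (1.72)."""
    n = len(s)
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        t[i][i] = 1.0 / s[i][i]
        for j in range(i + 1, n):
            # sum_{m=i}^{j-1} t_im s_mj + t_ij s_jj = 0
            t[i][j] = -sum(t[i][m] * s[m][j] for m in range(i, j)) / s[j][j]
    return t

def inverse_positive_definite(a):
    """A^{-1} = S^{-1} (S^{-1})^T = T T^T for symmetric positive-definite a."""
    t = triangular_inverse(triangular_factor(a))
    n = len(a)
    return [[sum(t[i][m] * t[j][m] for m in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 1],
     [2, 5, 3],
     [1, 3, 4]]
S = triangular_factor(A)
A_inv = inverse_positive_definite(A)
for row in A_inv:
    print([round(2 * v, 10) for v in row])   # twice the inverse, row by row
```

The factorization fails with an error when a diagonal term under the square root is not positive, which is the numerical counterpart of condition (1.68).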





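We now invert the matrix A by another method. Let us first note that the matrix
$$E_{12} = \begin{Vmatrix} 1 & k & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$
has the property that $E_{12}B$ is a new matrix which can be obtained by multiplying the second row of B by k and adding these elements to the corresponding elements of the first row of B. Thus
$$\begin{Vmatrix} 1 & k & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix}\begin{Vmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{Vmatrix} = \begin{Vmatrix} b_{11} + kb_{21} & b_{12} + kb_{22} & b_{13} + kb_{23} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{Vmatrix}$$
Let the reader show that placing k in the rth row and sth column of the unit matrix E produces a matrix $E_{rs}$ such that $E_{rs}B$ is a matrix identical to B except that the elements $b_{r\alpha}$, $\alpha = 1, 2, \ldots, n$, are replaced by $b_{r\alpha} + kb_{s\alpha}$, $r \neq s$. Notice that $|E_{rs}| = 1$ for $r \neq s$. Now let A be a square matrix such that $|A| \neq 0$. We consider the matrix $C = \|A, E\|$. For the A in Example 1.26
$$C = \begin{Vmatrix} 1 & 2 & 1 & 1 & 0 & 0 \\ 2 & 5 & 3 & 0 & 1 & 0 \\ 1 & 3 & 4 & 0 & 0 & 1 \end{Vmatrix} \tag{1.73}$$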








We manipulate C by operations on the rows until the first three rows and columns of C become E. These operations are equivalent to multiplying C on the left by matrices of the type $E_{rs}$ discussed above. Let B be the product of all the $E_{rs}$. Then
$$BC = \|BA, BE\| = \|E, B\|$$
Hence BA = E, and $B = A^{-1}$. We obtain B from $\|E, B\|$. For example, starting with the C of (1.73), we multiply the first row by $-2$ and add to the second row, and subtract the first row from the third row. This yields
$$\begin{Vmatrix} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & -2 & 1 & 0 \\ 0 & 1 & 3 & -1 & 0 & 1 \end{Vmatrix}$$
We can easily obtain zeros in the first and third row of the second column. We now have
$$\begin{Vmatrix} 1 & 0 & -1 & 5 & -2 & 0 \\ 0 & 1 & 1 & -2 & 1 & 0 \\ 0 & 0 & 2 & 1 & -1 & 1 \end{Vmatrix}$$
Adding $\frac{1}{2}$ the third row to the first row and subtracting $\frac{1}{2}$ the third row from the second row yields
$$\begin{Vmatrix} 1 & 0 & 0 & \frac{11}{2} & -\frac{5}{2} & \frac{1}{2} \\ 0 & 1 & 0 & -\frac{5}{2} & \frac{3}{2} & -\frac{1}{2} \\ 0 & 0 & 2 & 1 & -1 & 1 \end{Vmatrix}$$
Factoring 2 from the third row, that is, multiplication by
$$\begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{2} \end{Vmatrix}$$
yields the inverse matrix
$$A^{-1} = \frac{1}{2}\begin{Vmatrix} 11 & -5 & 1 \\ -5 & 3 & -1 \\ 1 & -1 & 1 \end{Vmatrix}$$




1.7. Differential Equations. We now consider the system of differential equations
$$\frac{d^2x^i}{dt^2} = a^i_\alpha x^\alpha \qquad i, \alpha = 1, 2, \ldots, n;\ a^i_\alpha = a^\alpha_i = \text{constants}$$
which can be written in the matrix form
$$\frac{d^2X}{dt^2} = AX \qquad A = A^T \tag{1.74}$$
We look for a linear transformation which will simplify (1.74). Let X = BY, B a constant matrix, so that (1.74) becomes
$$\frac{d^2Y}{dt^2} = (B^{-1}AB)Y \tag{1.75}$$
From previous considerations we have shown that it is possible to find a matrix B such that $B^{-1}AB$ is diagonal (nondiagonal terms are zero). For this matrix B we have
$$\frac{d^2y^i}{dt^2} = \lambda_iy^i \qquad \text{no summation on } i \tag{1.76}$$
since
$$B^{-1}AB = \begin{Vmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & 0 & \cdots & 0 \\ 0 & 0 & \lambda_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_n \end{Vmatrix}$$
The solution of (1.76) is
$$y^i = C^ie^{\sqrt{\lambda_i}\,t} + D^ie^{-\sqrt{\lambda_i}\,t} \qquad i = 1, 2, \ldots, n \tag{1.77}$$
From X = BY we can solve for $x^i(t)$, $i = 1, 2, \ldots, n$. The $y^i$, $i = 1, 2, \ldots, n$, are called normal coordinates.


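Example 1.27. Let us consider two particles of masses $m_1$, $m_2$, respectively, moving in a one-dimensional continuum, coupled in such a way that equal and opposite forces proportional to their distance apart act on the particles. The differential equations of motion are
$$\begin{aligned} m_1\frac{d^2x_1}{dt^2} &= -a(x_1 - x_2) \\ m_2\frac{d^2x_2}{dt^2} &= a(x_1 - x_2) \end{aligned} \tag{1.78}$$
For convenience, let $y_1 = \sqrt{m_1}\,x_1$, $y_2 = \sqrt{m_2}\,x_2$, $\omega_1^2 = a/m_1$, $\omega_2^2 = a/m_2$, $k = a/(m_1m_2)^{1/2}$, so that (1.78) becomes
$$\begin{aligned} \frac{d^2y_1}{dt^2} &= -\omega_1^2y_1 + ky_2 \\ \frac{d^2y_2}{dt^2} &= ky_1 - \omega_2^2y_2 \end{aligned}$$
or
$$\frac{d^2Y}{dt^2} = \frac{d^2}{dt^2}\begin{Vmatrix} y_1 \\ y_2 \end{Vmatrix} = \begin{Vmatrix} -\omega_1^2 & k \\ k & -\omega_2^2 \end{Vmatrix}\begin{Vmatrix} y_1 \\ y_2 \end{Vmatrix} = AY$$
The characteristic equation is
$$\begin{vmatrix} -\omega_1^2 - \lambda & k \\ k & -\omega_2^2 - \lambda \end{vmatrix} = 0$$
so that $\lambda = \frac{1}{2}\{-(\omega_1^2 + \omega_2^2) \pm [(\omega_1^2 - \omega_2^2)^2 + 4k^2]^{1/2}\}$. Let the reader find $y_1(t)$, $y_2(t)$, and hence $x_1(t)$, $x_2(t)$.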
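A quick numerical check of the roots in Example 1.27 is sketched below in Python. It is an added illustration, not part of the original text; the function name `normal_mode_roots` and the sample values of $m_1$, $m_2$, a are hypothetical. Since $k^2 = a^2/(m_1m_2) = \omega_1^2\omega_2^2$, the constant term of the characteristic equation vanishes, so one expects the roots $\lambda = 0$ and $\lambda = -(\omega_1^2 + \omega_2^2)$.

```python
import math

def normal_mode_roots(m1, m2, a):
    """Roots of the characteristic equation of Example 1.27.

    Evaluates lambda = (1/2){-(w1^2 + w2^2) +/- [(w1^2 - w2^2)^2 + 4k^2]^(1/2)}
    with w1^2 = a/m1, w2^2 = a/m2, k = a/sqrt(m1*m2).
    """
    w1sq = a / m1
    w2sq = a / m2
    k = a / math.sqrt(m1 * m2)
    root = math.sqrt((w1sq - w2sq) ** 2 + 4 * k * k)
    return (0.5 * (-(w1sq + w2sq) + root),
            0.5 * (-(w1sq + w2sq) - root))

# Hypothetical sample values: m1 = 1, m2 = 4, a = 2.
lam_plus, lam_minus = normal_mode_roots(1.0, 4.0, 2.0)
print(lam_plus, lam_minus)
```

The zero root corresponds to uniform motion of the system as a whole, and the negative root to an oscillation of the two particles relative to one another.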
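1.8. Subdivision of a Matrix. A matrix A may be subdivided into rectangular arrays, each array in turn being thought of as a matrix. For example,
$$A = \begin{Vmatrix} 2 & -1 & 3 & 4 \\ 0 & 1 & -1 & 2 \\ 3 & 2 & 1 & 5 \\ -1 & 4 & 6 & -2 \\ 0 & -2 & 1 & 1 \end{Vmatrix} = \begin{Vmatrix} A_1 & A_2 \\ A_3 & A_4 \end{Vmatrix}$$
where
$$A_1 = \begin{Vmatrix} 2 & -1 \\ 0 & 1 \end{Vmatrix} \quad A_2 = \begin{Vmatrix} 3 & 4 \\ -1 & 2 \end{Vmatrix} \quad A_3 = \begin{Vmatrix} 3 & 2 \\ -1 & 4 \\ 0 & -2 \end{Vmatrix} \quad A_4 = \begin{Vmatrix} 1 & 5 \\ 6 & -2 \\ 1 & 1 \end{Vmatrix}$$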

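The linear system
$$y^i = a^i_\alpha x^\alpha \qquad i, \alpha = 1, 2, \ldots, n$$
may be written
$$y^i = \sum_{\alpha=1}^{k} a^i_\alpha x^\alpha + \sum_{\alpha=k+1}^{n} a^i_\alpha x^\alpha \qquad i = 1, 2, 3, \ldots, k, k+1, \ldots, n$$
In matrix form
$$\begin{Vmatrix} y^1 \\ \vdots \\ y^k \\ y^{k+1} \\ \vdots \\ y^n \end{Vmatrix} = \begin{Vmatrix} a^1_1 & \cdots & a^1_k & a^1_{k+1} & \cdots & a^1_n \\ \vdots & & \vdots & \vdots & & \vdots \\ a^k_1 & \cdots & a^k_k & a^k_{k+1} & \cdots & a^k_n \\ a^{k+1}_1 & \cdots & a^{k+1}_k & a^{k+1}_{k+1} & \cdots & a^{k+1}_n \\ \vdots & & \vdots & \vdots & & \vdots \\ a^n_1 & \cdots & a^n_k & a^n_{k+1} & \cdots & a^n_n \end{Vmatrix}\begin{Vmatrix} x^1 \\ \vdots \\ x^k \\ x^{k+1} \\ \vdots \\ x^n \end{Vmatrix}$$
or
$$\begin{Vmatrix} Y_1 \\ Y_2 \end{Vmatrix} = \begin{Vmatrix} A_1 & A_2 \\ A_3 & A_4 \end{Vmatrix}\begin{Vmatrix} X_1 \\ X_2 \end{Vmatrix}$$
Hence
$$\begin{aligned} Y_1 &= A_1X_1 + A_2X_2 \\ Y_2 &= A_3X_1 + A_4X_2 \end{aligned} \tag{1.79}$$
If we are not interested in solving for $x^{k+1}, x^{k+2}, \ldots, x^n$, we can eliminate $X_2$ from (1.79). We obtain
$$X_2 = A_4^{-1}(Y_2 - A_3X_1) \qquad |A_4| \neq 0$$
so that
$$X_1 = (A_1 - A_2A_4^{-1}A_3)^{-1}(Y_1 - A_2A_4^{-1}Y_2)$$
In a mesh circuit the $y^i$ are the impressed voltages, the $a^i_j$ are the impedances, and the $x^i$ are the currents.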
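The elimination of $X_2$ can be tried on a small case. The Python sketch below is an added illustration: the helper names `matmul`, `matsub`, `inverse`, and `solve_for_x1`, as well as the 4-by-4 sample system, are hypothetical and not taken from the text. Exact fractions from the standard library are assumed, and both $A_4$ and $A_1 - A_2A_4^{-1}A_3$ are assumed to have nonvanishing determinants.

```python
from fractions import Fraction

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def matsub(P, Q):
    return [[p - q for p, q in zip(prow, qrow)] for prow, qrow in zip(P, Q)]

def inverse(M):
    """Inverse by row operations on ||M, E|| (assumes |M| != 0)."""
    n = len(M)
    c = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        p = next(r for r in range(col, n) if c[r][col] != 0)
        c[col], c[p] = c[p], c[col]
        c[col] = [v / c[col][col] for v in c[col]]
        for r in range(n):
            if r != col:
                f = c[r][col]
                c[r] = [x - f * y for x, y in zip(c[r], c[col])]
    return [row[n:] for row in c]

def solve_for_x1(A1, A2, A3, A4, Y1, Y2):
    """X1 = (A1 - A2 A4^{-1} A3)^{-1} (Y1 - A2 A4^{-1} Y2)."""
    A2A4inv = matmul(A2, inverse(A4))
    lhs = matsub(A1, matmul(A2A4inv, A3))
    rhs = matsub(Y1, matmul(A2A4inv, Y2))
    return matmul(inverse(lhs), rhs)

# Sample system whose full solution is x = (1, 2, 3, 4).
A1 = [[2, 1], [1, 3]]
A2 = [[1, 0], [0, 1]]
A3 = [[1, 0], [0, 1]]
A4 = [[2, 1], [1, 2]]
Y1 = [[7], [11]]     # column vectors are stored as n-by-1 matrices
Y2 = [[11], [13]]
X1 = solve_for_x1(A1, A2, A3, A4, Y1, Y2)
print([[str(v) for v in row] for row in X1])
```

Only two 2-by-2 inversions are needed here, instead of one 4-by-4 inversion, which is the practical point of the subdivision.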


1.9. Conclusion. In the calculus the solution of $\dfrac{dx}{dt} = ax$, a = constant, is $x = x_0e^{at}$. Can we generalize this for the system $\dfrac{dX}{dt} = AX$? We see immediately that one would be led to consider matrices of the form $e^{At}$. How should we go about defining such a matrix? From the calculus $e^x = \sum_{n=0}^{\infty} x^n/n!$. This suggests that if B is a square matrix we define
$$e^B = E + B + \frac{1}{2!}B^2 + \cdots + \frac{1}{n!}B^n + \cdots \tag{1.80}$$
This poses a new problem. What do we mean by the sum of an infinite number of matrices? We define
$$B_r = E + B + \frac{1}{2!}B^2 + \cdots + \frac{1}{r!}B^r$$
as the rth partial sum of (1.80). If $\lim_{r\to\infty} B_r$ exists in the sense that each element of $B_r$ converges, we define
$$e^B = \lim_{r\to\infty} B_r$$
If B is a square matrix of order n, whose terms $b^i_j$ are uniformly bounded, that is, $|b^i_j| < A$ = constant, for $i, j = 1, 2, \ldots, n$, then $\lim_{r\to\infty} B_r$ exists. The proof is as follows: Consider the matrix $B^2$. Its elements are of the form $b^i_\alpha b^\alpha_j$. Thus each element of $B^2$ in absolute value is less than $nA^2$. The elements of $B^3$ are of the form $b^i_\alpha b^\alpha_\beta b^\beta_j$, which are bounded by $n^2A^3$. The elements of $B^k$ are bounded by $n^{k-1}A^k$. Hence every element of $B_r - E$ is bounded by the series
$$\sum_{k=1}^{r} \frac{n^{k-1}A^k}{k!} = \frac{1}{n}\sum_{k=1}^{r} \frac{(nA)^k}{k!}$$
Thus each term of $B_r$ converges, since each term of $B_r - E$ is a series bounded in absolute value term by term by the elements of the series $(1/n)\sum_{k=1}^{\infty} (nA)^k/k!$, which converges to $(1/n)(e^{nA} - 1)$ as $r \to \infty$. Let the reader define cos B and sin B. Is $\sin^2 B + \cos^2 B = E$? Let the reader also show that
$$\frac{d}{dt}e^{Bt} = Be^{Bt}$$
for a constant matrix B.
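The partial sums $B_r$ of (1.80) are easy to compute. The following Python sketch is an added illustration, with the function names `matmul` and `exp_partial_sum` chosen here. The test matrix is an assumption made for the check: for $B = \begin{Vmatrix} 0 & t \\ -t & 0 \end{Vmatrix}$ one has $B^2 = -t^2E$, so the series sums in closed form to $e^B = \begin{Vmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{Vmatrix}$, which gives something to compare against.

```python
import math

def matmul(P, Q):
    """Product of two square matrices stored as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_partial_sum(B, r):
    """Return the partial sum B_r = E + B + B^2/2! + ... + B^r/r!."""
    n = len(B)
    total = [[float(i == j) for j in range(n)] for i in range(n)]   # E
    term = [row[:] for row in total]                                # B^0 / 0!
    for k in range(1, r + 1):
        # Obtain B^k / k! from B^(k-1) / (k-1)! by one product and one division.
        term = [[v / k for v in row] for row in matmul(term, B)]
        total = [[s + u for s, u in zip(srow, urow)]
                 for srow, urow in zip(total, term)]
    return total

t = 1.0
B = [[0.0, t], [-t, 0.0]]
approx = exp_partial_sum(B, 20)
exact = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
print(approx)
print(exact)
```

With twenty terms the two matrices agree to within rounding error, in line with the bound $(1/n)\sum (nA)^k/k!$ on the elements of the series.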
Problems 


1. Show that the roots of (1.64) are real if $a_{ij}$ is real and $a_{ij} = a_{ji}$. Hint: Consider $AX = \lambda X$, assume $\lambda = \lambda_1 + i\lambda_2$, $X = X_1 + iX_2$, and show that $\lambda_2 = 0$.

2. Consider the quadratic form $Q = a_{\alpha\beta}\bar{x}^\alpha x^\beta$, the $x^\alpha$, $a_{\alpha\beta}$ complex. We may write $Q = \bar{X}^TAX$. If $A = \bar{A}^T$, show that Q is real by showing that $\bar{Q} = Q$. A matrix A such that $A = \bar{A}^T$ is called a Hermitian matrix.

3. If A and B are Hermitian (see Prob. 2), show that AB + BA and i(AB − BA) are Hermitian.

4. Show that the roots of (1.64) are real if A is Hermitian.

5. Find an orthogonal transformation which reduces
$$Q = x^2 + y^2 + z^2 + 2xz + 4\sqrt{2}\,yz$$
to canonical form.

6. Show that the characteristic roots of the matrix A are the same as those of the matrix $B^{-1}AB$.

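7. Write the system
$$\begin{aligned} m\frac{d^2x}{dt^2} + He\frac{dy}{dt} &= Ee \\ m\frac{d^2y}{dt^2} - He\frac{dx}{dt} &= 0 \end{aligned}$$
in the matrix form
$$\frac{d^2X}{dt^2} + \frac{He}{m}A\frac{dX}{dt} = \frac{Ee}{m}B \qquad X = \begin{Vmatrix} x \\ y \end{Vmatrix} \quad A = \begin{Vmatrix} 0 & 1 \\ -1 & 0 \end{Vmatrix} \quad B = \begin{Vmatrix} 1 \\ 0 \end{Vmatrix}$$
Let X = CY, $|C| \neq 0$, and find C so that the system becomes
$$\frac{d^2Y}{dt^2} + \frac{He}{m}\,i\begin{Vmatrix} 1 & 0 \\ 0 & -1 \end{Vmatrix}\frac{dY}{dt} = \frac{Ee}{2m}\begin{Vmatrix} 1 & -i \\ -i & 1 \end{Vmatrix}\begin{Vmatrix} 1 \\ 0 \end{Vmatrix}$$
Integrate this system of equations.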
8. Solve the system
$$\begin{aligned} x_1 + x_2 + x_3 + x_4 &= 0 \\ x_1 - x_2 + x_3 - x_4 &= 0 \\ 2x_1 + 3x_2 - x_3 + 4x_4 &= 7 \\ 3x_1 - x_2 + x_3 - x_4 &= 2 \end{aligned}$$
for $x_1$, $x_2$ by first eliminating $x_3$, $x_4$.

9. Consider the system of differential equations
$$\frac{d^2X}{dt^2} + A\frac{dX}{dt} + BX = 0$$
where A and B are constant matrices. Let
$$X = e^{\omega t}C \qquad C\ \text{constant}$$
and show that $\omega$ satisfies $|\omega^2\delta^i_j + \omega a^i_j + b^i_j| = 0$ if C is not the zero vector.

10. The characteristic equation of A is the determinant $|A - \lambda E| = 0$. This is a polynomial in $\lambda$, written
$$\lambda^n + b_1\lambda^{n-1} + b_2\lambda^{n-2} + \cdots + b_n = 0$$
See Dickson, "Modern Algebraic Theories," for a proof that A satisfies its characteristic equation, that is,
$$A^n + b_1A^{n-1} + b_2A^{n-2} + \cdots + b_nE = 0$$
This is the Hamilton-Cayley theorem.




REFERENCES 


Aitken, A. C.: "Determinants and Matrices," Oliver & Boyd, Ltd., London, 1942.

Albert, A. A.: "Introduction to Algebraic Theories," University of Chicago Press, Chicago, 1941.

Birkhoff, G., and S. MacLane: "A Survey of Modern Algebra," The Macmillan Company, New York, 1941.

Ferrar, W. L.: "Algebra," Oxford University Press, New York, 1941.

Michal, A. D.: "Matrix and Tensor Calculus," John Wiley & Sons, Inc., New York, 1947.

Veblen, O.: "Invariants of Quadratic Differential Forms," Cambridge University Press, New York, 1938.


CHAPTER 2 


VECTOR ANALYSIS 


2.1. Introduction. Elementary vector analysis is a study of directed 
line segments. The reader is well aware that displacements, velocities, 
accelerations, forces, etc., require for their description a direction as well 
as a magnitude. One cannot completely describe the motion of a par- 
ticle by simply stating that a 2-lb force acts upon it. The direction of 
the applied force must be stipulated with reference to a particular coordi- 
nate system. In much the same manner the knowledge that a particle 
has a speed of 3 fps relative to a given observer does not yield all the 
pertinent information as regards the motion of the particle with respect 
to the observer. One must know the direction of motion of the particle. 

A vector, by definition, is a directed line segment. Any physical 
quantity which can be represented by a vector will also be designated as 
a vector. The length of a vector when compared with a unit of length 
will be called the magnitude of the vector. The magnitude of a vector 
is thus a scalar. A scalar differs from a vector in that no direction is 
associated with a scalar. Speed, temperature, volume, etc., including
elements of the real-number system, are examples of scalars.

[Fig. 2.1]

Vectors will be represented by arrows (see Fig. 2.1), and boldface type
will be used to distinguish a vector from a scalar. The student
will have to adopt his own notation for describing a vector in writing. 

A vector of length 1 is called a unit vector. There are an infinity 
of unit vectors since the direction of a unit vector is arbitrary. If a 
represents the length of a vector, we shall write a = |a|. If |a| = 0, we 
say that a is a zero vector, a = 0. 

2.2. Equality of Vectors. Two vectors will be defined to be equal if, 
and only if, they are parallel, have the same sense of direction, and are 
of equal magnitude. The starting points of the vectors are immaterial. 
This does not imply that two forces which are equal will produce the 
same physical result. Our definition of equality is purely a mathematical 
definition. We write a = b if the vectors are equal. Moreover, we 
imply further that if a = b we may replace a in any vector equation by b 



and, conversely, we may replace b by a. Figure 2.2 shows two vectors 
a, b which are equal to each other. 
[Fig. 2.2]

2.3. Multiplication of a Vector by a Scalar. If we multiply a vector a by
a real number x, we define the product xa to be a new vector parallel
to a; the magnitude of xa is |x| times the magnitude of a. If x > 0, xa is
parallel to and has the same direction as a. If x < 0, xa is parallel to
and has the reverse direction of a (see Fig. 2.3). We note that

x(ya) = (xy)a = xya
0a = 0


It is immediately evident that two vectors are parallel if, and only if, 
one of them can be written as a scalar multiple of the other. 


[Fig. 2.3]    [Fig. 2.4]





2.4. Addition of Vectors. Let us suppose we have two vectors given, 
say a and b. The vector sum, written a + b, is defined as follows: A 
triangle is constructed with a and b forming two sides of the triangle. 
The vector drawn from the starting point of a to the arrow of b is defined 
as the vector sum a + b (see Fig. 2.4). 

From Euclidean geometry we note that 


a + b = b + a (2.1)
(a + b) + c = a + (b + c) (2.2)
x(a + b) = xa + xb (2.3)

Furthermore, a + 0 = a, (x + y)a = xa + ya, and if a = b, c = d, then
a + c = b + d. The reader should give geometric proofs of the above
statements.

Subtraction of vectors can be reduced to addition by defining

a − b = a + (−b)




An equivalent definition of a − b is the following: We look for the vector
c such that c + b = a. c is defined as the vector a − b (see Fig. 2.5).

[Fig. 2.5]    [Fig. 2.6]

2.5. Applications to Geometry. Let a, b, c be vectors with a common
origin O whose end points A, B, C, respectively, lie on a straight line.
Let C divide AB in the ratio x:y, x + y = 1 (see Fig. 2.6). We propose
to determine c as a linear combination of a and b. It is evident that
c = a + AC. But AC = x AB = x(b − a). Hence

c = (1 − x)a + xb = ya + xb (2.4)

Conversely, let c = ya + xb, x + y = 1. If a, b, c have a common
origin O, we show that their end points lie on a straight line. We write
c = (1 − x)a + xb so that c = a + x(b − a). Since x(b − a) is a
vector parallel to the vector b − a = AB, we note from the definition
of vector addition that the end point C lies on the line joining A to B.
Q.E.D.

Equation (2.4) is very useful in 
solving geometric problems involv- 
ing ratios of line segments. 


Example 2.1. The diagonals of a parallelogram bisect each other. Let ABCD be
any parallelogram, and let O be any point in space (see Fig. 2.7). The statement
c − d = b − a defines the parallelogram ABCD. Hence

(a + c)/2 = (b + d)/2

[Fig. 2.7]

The vector (a + c)/2 with origin at O has its end point on the line joining AC [see
Eq. (2.4)]. The vector (b + d)/2 with origin at O has its end point on the line joining
BD. There is only one vector from O whose end point lies on these two lines, namely,
the vector p. Hence P bisects AC and BD.




Example 2.2. In Fig. 2.8, D divides CB in the ratio 3:1; E divides AB in the ratio
3:2. How does P divide CE, AD?

[Fig. 2.8]

Imagine vectors a, b, c, d, e, p drawn from a fixed point O to the points A, B, C, D,
E, P, respectively. From Eq. (2.4) we have

d = (c + 3b)/4        e = (2a + 3b)/5

Since p depends linearly on a and d, and also on c and e, we eliminate the vector
b from the above two equations. This yields

4d + 2a = 5e + c

or

(2/3)d + (1/3)a = (5/6)e + (1/6)c

The vector (2/3)d + (1/3)a with origin at O must have its end point lying on the line AD.
Similarly (5/6)e + (1/6)c has its end point lying on the line EC. These two vectors have
been shown to be equal. Then

p = (2/3)d + (1/3)a = (5/6)e + (1/6)c        Why?

Thus P divides AD in the ratio 2:1 and divides CE in the ratio 5:1.
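
The ratios found in Example 2.2 can be checked with exact arithmetic. The sketch below applies Eq. (2.4), with the ratio x:y normalised so that the weights sum to 1, to a sample triangle; the coordinates of A, B, C and the helper name `divide` are assumptions made for illustration only.

```python
from fractions import Fraction


def divide(a, b, x, y):
    """End point of the vector to C, where C divides AB in the ratio x:y.

    This is Eq. (2.4) with normalised weights: c = (y*a + x*b) / (x + y).
    """
    total = Fraction(x + y)
    return tuple((y * ai + x * bi) / total for ai, bi in zip(a, b))


# Hypothetical triangle; any non-degenerate choice should behave the same way.
A, B, C = (0, 0), (10, 0), (2, 8)

D = divide(C, B, 3, 1)  # D divides CB in the ratio 3:1
E = divide(A, B, 3, 2)  # E divides AB in the ratio 3:2

p_on_AD = divide(A, D, 2, 1)  # point dividing AD in the ratio 2:1
p_on_CE = divide(C, E, 5, 1)  # point dividing CE in the ratio 5:1

print(p_on_AD, p_on_CE)
```

Both constructions land on the same point, which is the intersection P of AD and CE.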


Problems

1. Interpret a/|a|.

2. a, b, c are consecutive vectors forming a triangle. What is their vector sum?
Generalize this result.

3. a and b are consecutive vectors of a parallelogram. Express the diagonal vectors
in terms of a and b.

4. a and b are not parallel. If xa + yb = la + mb, show that x = l, y = m.

5. Show graphically that |a| + |b| ≥ |a + b|, |a − b| ≥ | |a| − |b| |.

6. Show that the midpoints of the lines which join the midpoints of the opposite
sides of a quadrilateral coincide. The four sides of the quadrilateral are not
necessarily coplanar.

7. Show that the medians of a triangle meet at a point P which divides each median
in the ratio 1:2.

8. Vectors are drawn from the center of a regular polygon to its vertices. Show
that the vector sum is zero.

9. a, b, c, d are vectors with a common origin. Find a necessary and sufficient
condition that their end points lie in a plane.

10. Show that, if two triangles in space are so situated that the three points of
intersection of corresponding sides lie on a line, then the lines joining the corresponding
vertices pass through a common point, and conversely. This is Desargues' theorem.


2.6. Coordinate Systems. The reader is already familiar with the 
Euclidean space of three dimensions encountered in the analytic geometry 
and the calculus. The cartesian coordinate system is frequently used 
for describing the position of a point in this space. The reader, no 
doubt, also is acquainted with other coordinate systems, e.g., cylindrical 
coordinates, spherical coordinates. 




We let i, j, k be the three unit vectors along the positive x, y, and z axes,
respectively. If r is the vector from the origin to a point P(x, y, z), then
(see Fig. 2.9)

r = xi + yj + zk (2.5)

The numbers x, y, z are called the components of the vector r. Note
that they represent the projections of the vector r on the x, y, and z axes.
r is called the position vector of the point P.





[Fig. 2.9]

By translating the origin of any vector A to the origin of our cartesian
coordinate system it can be easily seen that

A = A_1 i + A_2 j + A_3 k (2.6)

A_1, A_2, A_3 are the projections of A on the x, y, and z axes, respectively.
They are called the components of A.

Let us now consider the motion of a fluid covering all of space. At
any point P(x, y, z) the fluid will have a velocity V with components
u, v, w. Thus

V = u(x, y, z, t)i + v(x, y, z, t)j + w(x, y, z, t)k (2.7)

The velocity components u, v, w will, in general, depend on the point
P(x, y, z) and the time t. The most general vector encountered will be
of the form given by (2.7). Equation (2.7) describes a vector field.
If we fix t, we obtain an instantaneous view of the vector field. Every
point in space has a vector associated with it. As time goes on, the
vector field changes. A special case of (2.7) is the vector field

V = u(x, y, z)i + v(x, y, z)j + w(x, y, z)k (2.8)

Equation (2.8) describes a steady-state vector field. The vector field
is independent of the time, but the components depend on the coordinates
of the point P(x, y, z).




The simplest vector field occurs when the components of V are constant 
throughout all of space. A vector field of this type is said to be uniform. 


Example 2.3. A particle of mass m is placed at the origin. The force of attraction
which the mass would exert on a unit mass placed at the point P(x, y, z) is

$$\mathbf{F} = -Gm\,\frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{(x^2 + y^2 + z^2)^{3/2}}$$

This is Newton's law of attraction. Note that F is a steady-state vector field.

It is easily verified that if

A = A_1 i + A_2 j + A_3 k
B = B_1 i + B_2 j + B_3 k

then

A + B = (A_1 + B_1)i + (A_2 + B_2)j + (A_3 + B_3)k
xA + yB = (xA_1 + yB_1)i + (xA_2 + yB_2)j + (xA_3 + yB_3)k

2.7. Scalar, or Dot, Product. We define the scalar, or dot, product of
two vectors by the identity

a · b = |a| |b| cos θ (2.9)

where θ is the angle between the two vectors when they are drawn from
a common origin. Since cos θ = cos (−θ), there is no ambiguity as to
how θ is chosen.
From (2.9) it follows that 
a · b = b · a
xa · yb = xy a · b (2.10)
a · a = |a|² = a²
a = b, c = d implies a · c = b · d

If a is perpendicular to b, then a · b = 0. Conversely, if a · b = 0,
|a| ≠ 0, |b| ≠ 0, then a is perpendicular to b.

Now a · b is equal to the projection of a onto b multiplied by the length
of b (see Fig. 2.10). Thus

a · b = (proj_b a)|b| = (proj_a b)|a|

[Fig. 2.10]

With this in mind we proceed to prove the distributive law,

a · (b + c) = a · b + a · c (2.11)


From Fig. 2.11 it is apparent that 


a · (b + c) = [proj_a (b + c)]|a|
            = (proj_a b + proj_a c)|a|
            = (proj_a b)|a| + (proj_a c)|a|
            = b · a + c · a
            = a · b + a · c

A repeated application of (2.11) yields

(a + b) · (c + d) = a · (c + d) + b · (c + d)
                  = a · c + a · d + b · c + b · d

[Fig. 2.11]    [Fig. 2.12]

Example 2.4. Cosine Law of Trigonometry (Fig. 2.12)

c² = b² + a² − 2ab cos θ

Example 2.5. i · i = j · j = k · k = 1, i · j = j · k = k · i = 0. Hence if

A = A_1 i + A_2 j + A_3 k
B = B_1 i + B_2 j + B_3 k

then

A · B = A_1B_1 + A_2B_2 + A_3B_3 (2.12)


Formula (2.12) is very useful. It should be memorized. 
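
Formula (2.12), combined with the definition (2.9), gives a direct way to compute angles and to test perpendicularity. The following is a minimal sketch using plain tuples for components; the function names and the sample vectors are illustrative assumptions.

```python
import math


def dot(a, b):
    """Scalar product by components, Eq. (2.12)."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Magnitude |a| = sqrt(a . a)."""
    return math.sqrt(dot(a, a))


def angle_between(a, b):
    """Angle theta from a . b = |a||b| cos(theta), Eq. (2.9); needs |a|, |b| != 0."""
    cos_theta = dot(a, b) / (norm(a) * norm(b))
    # Guard against round-off pushing the cosine slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


a = (1.0, 2.0, 2.0)
b = (2.0, -2.0, 1.0)
print(dot(a, b))  # zero, so a and b are perpendicular

theta = angle_between((1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
print(math.degrees(theta))  # close to 45 degrees
```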

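Example 2.6. Let $Ox^1x^2x^3$ and $O\bar{x}^1\bar{x}^2\bar{x}^3$ be orthogonal rectangular cartesian coordinate
systems with common origin O. We now use the superscript convention of
Chap. 1. The unit vectors along the $x^j$ axes, j = 1, 2, 3, are designated by $\mathbf{i}_j$. Similarly
$\bar{\mathbf{i}}_j$, j = 1, 2, 3, is the set of unit vectors along the $\bar{x}^j$ axes. Let $a^\beta_\alpha$ be the cosine of the
angle between the vectors $\mathbf{i}_\alpha$, $\bar{\mathbf{i}}_\beta$, α, β = 1, 2, 3. The projections of $\mathbf{i}_1$ on the $\bar{x}^1$, $\bar{x}^2$, $\bar{x}^3$
axes are $a^1_1$, $a^2_1$, $a^3_1$, respectively. Hence $\mathbf{i}_1 = a^1_1\bar{\mathbf{i}}_1 + a^2_1\bar{\mathbf{i}}_2 + a^3_1\bar{\mathbf{i}}_3 = a^\beta_1\bar{\mathbf{i}}_\beta$. Let the
reader show that

$$\mathbf{i}_\alpha = a^\beta_\alpha\,\bar{\mathbf{i}}_\beta \qquad (2.13)$$

We have (Fig. 2.13) $\mathbf{r} = \bar{\mathbf{r}}$, so that $x^\alpha\mathbf{i}_\alpha = \bar{x}^\beta\bar{\mathbf{i}}_\beta$. From (2.13) $a^\beta_\alpha x^\alpha\bar{\mathbf{i}}_\beta = \bar{x}^\beta\bar{\mathbf{i}}_\beta$ so that

$$\bar{x}^\beta = a^\beta_\alpha x^\alpha \qquad \beta = 1, 2, 3 \qquad (2.14)$$

Equation (2.14) represents the coordinate transformation (linear) between our two
coordinate systems. In matrix form, $\bar{X} = AX$. From $\sum_{i=1}^{3}\bar{x}^i\bar{x}^i = \sum_{i=1}^{3}x^ix^i$ it follows
that $\bar{X}^T\bar{X} = X^TX$, which in turn implies $A^TA = E$ or $A^T = A^{-1}$.

[Fig. 2.13]

If $U^1$, $U^2$, $U^3$ are the components of a vector when referred to the x coordinate
system, and if $\bar{U}^1$, $\bar{U}^2$, $\bar{U}^3$ are the components of the same vector when referred to the $\bar{x}$
coordinate system, one obtains

$$\bar{U}^\beta = a^\beta_\alpha U^\alpha \qquad (2.15)$$

This result is obtained in exactly the same manner in which Eq. (2.14) was derived.
From (2.14), $\partial\bar{x}^\beta/\partial x^\alpha = a^\beta_\alpha$ so that (2.15) may be written

$$\bar{U}^\beta = \frac{\partial\bar{x}^\beta}{\partial x^\alpha}U^\alpha \qquad (2.16)$$

Equation (2.16) will be the starting point for the definition of a contravariant vector
field (see Chap. 3).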
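
The orthogonality of the transformation matrix in Example 2.6 can be illustrated with a concrete case. The sketch below assumes the barred axes are obtained by rotating the original axes about the third axis through a chosen angle; that special case, the helper names, and the sample numbers are illustrative assumptions rather than part of the example.

```python
import math


def direction_cosines(angle):
    """Matrix A of direction cosines (row beta, column alpha) for barred axes
    rotated by `angle` about the third axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform(m, x):
    """Barred components: x_bar^beta = a^beta_alpha x^alpha, Eq. (2.14)."""
    return [sum(m[b][a] * x[a] for a in range(3)) for b in range(3)]


A = direction_cosines(0.3)
product = matmul(transpose(A), A)  # should be close to the unit matrix E

x = [1.0, 2.0, 3.0]
x_bar = transform(A, x)
length_sq = sum(v * v for v in x)
length_sq_bar = sum(v * v for v in x_bar)
print(product)
print(length_sq, length_sq_bar)
```

The squared length of the vector comes out the same in both coordinate systems, which is the invariance used to derive $A^TA = E$.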


Problems 


1. Add and subtract the vectors a = 2i — 3j + 5k, b = —2i + 2j + 2k. Show 
that the vectors are perpendicular. 

2. Find the angle between the vectors a = 2i — 3j + k, b = 3i — j — 2k. 

3. Let a and b be unit vectors in the xy plane making angles α and β with the x axis.
Show that a = cos α i + sin α j, b = cos β i + sin β j, and prove that

cos (α − β) = cos α cos β + sin α sin β

4. Show that the equation of a sphere with center at P_0(x_0, y_0, z_0) and radius a is
(x − x_0)² + (y − y_0)² + (z − z_0)² = a².

5. Show that the equation of the plane passing through the point P_0(x_0, y_0, z_0)
normal to the vector Ai + Bj + Ck is

A(x − x_0) + B(y − y_0) + C(z − z_0) = 0

6. Show that the equation of a straight line through the point P_0(x_0, y_0, z_0) parallel
to the vector li + mj + nk is

x = x_0 + λl        y = y_0 + λm        z = z_0 + λn        −∞ < λ < ∞

7. Prove that the sum of the squares of the diagonals of a parallelogram is equal to
the sum of the squares of its sides.

8. Show that the shortest distance from the point P_0(x_0, y_0, z_0) to the plane
Ax + By + Cz + D = 0 is

$$\frac{|Ax_0 + By_0 + Cz_0 + D|}{(A^2 + B^2 + C^2)^{1/2}}$$


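9. For Eq. (2.14), $\sum_{\beta=1}^{3}\bar{x}^\beta\bar{x}^\beta = \sum_{\beta=1}^{3}x^\beta x^\beta$. Show that this implies

$$\sum_{\beta=1}^{3} a^\beta_\alpha a^\beta_\gamma = \begin{cases} 0 & \text{if } \alpha \neq \gamma \\ 1 & \text{if } \alpha = \gamma \end{cases}$$

10. For the $a^\beta_\alpha$ of Prob. 9 show that if $\bar{U}^\alpha = a^\alpha_\beta U^\beta$, $\bar{V}^\alpha = a^\alpha_\beta V^\beta$, then

$$\sum_{\alpha=1}^{3}\bar{U}^\alpha\bar{V}^\alpha = \sum_{\alpha=1}^{3}U^\alpha V^\alpha$$
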


2.8. Vector, or Cross, Product. One can construct a vector c from two 
given vectors a, b as follows: Let a and b be translated so that they have 
a common origin, and let them form 
the sides of a parallelogram of area A = |a| |b| sin θ (see Fig. 2.14). We
define c to be perpendicular to the plane of this parallelogram with
magnitude equal to the area of the parallelogram. The direction of c
is obtained by rotating a into b (angle of rotation less than 180°) and
considering the motion of a right-hand screw. The vector c thus obtained
is defined as the vector, or cross, product of a and b, written

c = a × b = |a| |b| sin θ E (2.17)

with |E| = 1, E · a = E · b = 0.

[Fig. 2.14]

The vector product occurs frequently in mechanics and electricity, but
for the present we discuss its algebraic behavior. It follows that

a × b = −b × a

so that the vector product is not commutative. If a and b are parallel,
a × b = 0. Conversely, if a × b = 0, then a and b are parallel provided
|a| ≠ 0, |b| ≠ 0. In particular a × a = 0.

The distributive law, a × (b + c) = a × b + a × c, can be shown to
hold as follows: Let

u = a × (b + c) − a × b − a × c

We attempt to show that u = 0. Since the distributive law for scalar
multiplication holds, we have

v · u = v · a × (b + c) − v · (a × b) − v · (a × c) (2.18)



for arbitrary v. In the next paragraph it will be shown that 

a · (b × c) = (a × b) · c

Thus (2.18) may be written

v · u = (v × a) · (b + c) − (v × a) · b − (v × a) · c
      = (v × a) · b + (v × a) · c − (v × a) · b − (v × a) · c
      = 0

This implies that u = 0 or v ⊥ u. Since v can be chosen arbitrarily,
and hence picked not perpendicular to u, it follows that u = 0 so that

a × (b + c) = a × b + a × c (2.19)


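Example 2.7. One sees that i × i = 0, j × j = 0, k × k = 0, i × j = k, j × k = i,
k × i = j. For the vectors a = a_1 i + a_2 j + a_3 k, b = b_1 i + b_2 j + b_3 k we obtain
a × b = (a_2b_3 − a_3b_2)i + (a_3b_1 − a_1b_3)j + (a_1b_2 − a_2b_1)k from the distributive law.
Symbolically

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \qquad (2.20)$$

Equation (2.20) is to be expanded by the ordinary method of determinants.

Example 2.8

a = i − 3j + 2k        b = 4i + j − 3k

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & -3 & 2 \\ 4 & 1 & -3 \end{vmatrix} = 7\mathbf{i} + 11\mathbf{j} + 13\mathbf{k}$$

(a × b) · a = 7 − 33 + 26 = 0        (a × b) · b = 28 + 11 − 39 = 0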
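
The component expansion of Example 2.7 translates directly into code. The sketch below, with illustrative function names, reproduces the numbers of Example 2.8 and the perpendicularity checks that follow it.

```python
def dot(a, b):
    """Scalar product by components."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product by components, the expansion of the determinant (2.20)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)


a = (1, -3, 2)
b = (4, 1, -3)
c = cross(a, b)

print(c)                     # components of a x b
print(dot(c, a), dot(c, b))  # both vanish: a x b is perpendicular to a and b
print(cross(b, a))           # reversed order changes the sign
```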


Example 2.9. Rotation of a Particle. Assume that a particle is rotating about a
fixed line L with angular speed ω. We assume that the shortest distance of the
particle from L remains constant. Let us define the angular velocity of the particle
as the vector ω whose direction is along L and whose length is ω. We choose the
direction of ω in the usual sense of a right-hand-screw advance (see Fig. 2.15).
Let r be the position vector of P with origin O on the line L. It is a simple matter
for the reader to show that the velocity of P, say V, is parallel to, and has the
same magnitude as, ω × r. Thus V = ω × r.
[Fig. 2.15]

Example 2.10. Motion of a Rigid Body with One Fixed Point. Let Oxyz be a fixed
coordinate system, and $O\bar{x}\bar{y}\bar{z}$ a coordinate system attached to the rigid body whose
fixed point is the origin O. Let P be a point of the rigid body. As time progresses,
the coordinates $\bar{x}$, $\bar{y}$, $\bar{z}$ remain constant since the $O\bar{x}\bar{y}\bar{z}$ coordinate system is rigidly
attached to the moving frame. From (2.14) we have $\bar{x}^i = a^i_j x^j$. Hence

$$\frac{d\bar{x}^i}{dt} = 0 = a^i_j\frac{dx^j}{dt} + \frac{da^i_j}{dt}x^j$$

so that $\frac{dx^k}{dt} = -A^k_i\frac{da^i_j}{dt}x^j$, $A^k_i a^i_j = \delta^k_j$, or $\|A^k_i\| = \|a^i_j\|^{-1}$. If we define
$\omega^k_j = -A^k_i\frac{da^i_j}{dt}$, we have

$$\frac{dx^k}{dt} = \omega^k_j x^j \qquad (2.21)$$

However, $\sum_{k=1}^{3}x^kx^k$ represents the square of the invariant distance from O to P so that
$\sum_{k=1}^{3}x^k\frac{dx^k}{dt} = 0$ and $\sum_{k=1}^{3}\omega^k_j x^j x^k = 0$. From Example 1.2 it follows that $\omega^k_j = -\omega^j_k$.
Writing $\omega_1 = \omega^3_2$, $\omega_2 = \omega^1_3$, $\omega_3 = \omega^2_1$, Equation (2.21) can now be written as

$$\frac{dx^1}{dt} = \omega^1_2x^2 + \omega^1_3x^3 = \omega_2x^3 - \omega_3x^2$$
$$\frac{dx^2}{dt} = \omega^2_1x^1 + \omega^2_3x^3 = \omega_3x^1 - \omega_1x^3$$
$$\frac{dx^3}{dt} = \omega^3_1x^1 + \omega^3_2x^2 = \omega_1x^2 - \omega_2x^1$$

so that $\mathbf{v} = \frac{dx^1}{dt}\mathbf{i} + \frac{dx^2}{dt}\mathbf{j} + \frac{dx^3}{dt}\mathbf{k} = \boldsymbol{\omega}\times\mathbf{r}$, $\boldsymbol{\omega} = \omega_1\mathbf{i} + \omega_2\mathbf{j} + \omega_3\mathbf{k}$. It follows from
the result of Example 2.9 that the motion of a rigid body with one point fixed can be
characterized as follows: There exists an angular velocity vector ω whose components,
in general, change with time, such that at any instant the motion of the rigid body is
one of pure rotation with angular velocity ω. This property is very important in the
study of the motion of a gyroscope. It can easily be shown that the most general
rigid-body motion consists of a translation plus a pure rotation.


2.9. Multiple Scalar and Vector Products. The triple scalar product 
a-(b X c) has a simple geometric interpretation. This scalar represents 
the volume of the parallelepiped formed by the coterminous sides a, b, c, 
since 


a · (b × c) = |a| |b| |c| sin θ cos α
            = hA = volume

where A is the area of the parallelogram with sides b and c and h is the
altitude of the parallelepiped (see Fig. 2.16). It is easy to see that
(a × b) · c represents the same volume. Hence it is permissible to
interchange the dot and cross in the triple scalar product. Since there can
be no confusion as to the meaning of a · (b × c), it is usually written as
(abc). The expression a × (b · c) is meaningless. Why? We let the
reader prove that

$$\mathbf{a}\cdot\mathbf{b}\times\mathbf{c} = (\mathbf{abc}) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \qquad (2.22)$$

From determinant theory, or otherwise, it follows that

(abc) = (cab) = (bca) = −(bac) = −(cba) = −(acb)


It is easy to show that a necessary and sufficient condition that a, b, c 
lie in the same plane when they have a common origin is that (abc) = 0. 
In particular, (aac) = 0. 
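
The volume interpretation, the permutation rules, and the coplanarity test for the triple scalar product can be checked on small integer examples. This is a minimal sketch; the helper names and sample vectors are illustrative assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def triple(a, b, c):
    """Triple scalar product (abc) = a . (b x c)."""
    return dot(a, cross(b, c))


a, b, c = (2, 0, 0), (0, 3, 0), (1, 1, 4)

volume = triple(a, b, c)                     # base area 6 times altitude 4
cyclic = (triple(c, a, b), triple(b, c, a))  # cyclic permutations keep the value
swapped = triple(b, a, c)                    # interchanging two vectors flips the sign

# c2 = a2 + b2 lies in the plane of a2 and b2, so the three are coplanar.
a2, b2 = (1, 2, 3), (4, 5, 6)
c2 = tuple(p + q for p, q in zip(a2, b2))
coplanar = triple(a2, b2, c2)

print(volume, cyclic, swapped, coplanar)
```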


[Fig. 2.16]


The triple vector product a X (b X c) is an important vector. It is 
certainly a vector, call it V, since it is the vector product of a and b X c. 
We know that V is perpendicular to the vector b X c. However, b X c 
is perpendicular to the plane of b and c so that V lies in the plane of b and 
c. If b and c are not parallel, then V = λb + μc. If b and c are parallel,
V = 0. Since V · a = 0, we have λ(b · a) + μ(c · a) = 0 so that

V = λ_1[(c · a)b − (b · a)c]

It can be shown that λ_1 = 1 so that

a × (b × c) = (a · c)b − (a · b)c (2.23)

The expansion (2.23) of a × (b × c) is often referred to as the rule of the
middle factor. Similarly

(a × b) × c = (a · c)b − (b · c)a (2.24)
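
The expansions (2.23) and (2.24) are easy to confirm on integer vectors, where the arithmetic is exact. The sketch below uses illustrative helper names and arbitrarily chosen sample vectors; it also shows that the two groupings give different vectors in general.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def scale(s, a):
    return tuple(s * x for x in a)


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


a, b, c = (1, 2, 3), (-2, 0, 5), (4, -1, 2)

# Eq. (2.23): a x (b x c) = (a.c) b - (a.b) c
lhs_223 = cross(a, cross(b, c))
rhs_223 = sub(scale(dot(a, c), b), scale(dot(a, b), c))

# Eq. (2.24): (a x b) x c = (a.c) b - (b.c) a
lhs_224 = cross(cross(a, b), c)
rhs_224 = sub(scale(dot(a, c), b), scale(dot(b, c), a))

print(lhs_223, rhs_223)
print(lhs_224, rhs_224)
```

For these sample vectors the two triple products differ, so the position of the parentheses matters.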

More complicated products can be simplified by use of the triple 
products. For example, we can expand (a X b) X (c X d) by consider- 
ing a X b as a single vector and applying (2.23). 


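(a × b) × (c × d) = (a × b · d)c − (a × b · c)d
                  = (abd)c − (abc)d

Also

$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) = [(\mathbf{a}\times\mathbf{b})\times\mathbf{c}]\cdot\mathbf{d} = [(\mathbf{a}\cdot\mathbf{c})\mathbf{b} - (\mathbf{b}\cdot\mathbf{c})\mathbf{a}]\cdot\mathbf{d}$$
$$= (\mathbf{a}\cdot\mathbf{c})(\mathbf{b}\cdot\mathbf{d}) - (\mathbf{b}\cdot\mathbf{c})(\mathbf{a}\cdot\mathbf{d}) = \begin{vmatrix} \mathbf{a}\cdot\mathbf{c} & \mathbf{a}\cdot\mathbf{d} \\ \mathbf{b}\cdot\mathbf{c} & \mathbf{b}\cdot\mathbf{d} \end{vmatrix}$$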

Example 2.11. Consider the spherical triangle ABC (sides are arcs of great circles)
(see Fig. 2.17). Let the sphere be of radius 1. Now

(a × b) · (a × c) = b · c − (a · b)(a · c)

since a · a = 1. The angle between a × b and a × c is the same as the dihedral angle
A between the planes OAC and OAB, since a × b is perpendicular to the plane of OAB
and since a × c is perpendicular to the plane of OAC. Hence

sin γ sin β cos A = cos α − cos γ cos β

[Fig. 2.17]
Problems 


1. Show by two methods that the vectors a = 3i − j + 2k, b = −12i + 4j − 8k are
parallel.

2. Find a unit vector perpendicular to the vectors a = i − j + 2k, b = 3i + j − k.

3. If a, b, c, d have a common origin, interpret the equation (a × b) · (c × d) = 0.

4. Write a vector equation which specifies that the plane through a and b is parallel
to the plane through c and d.

5. Show that d · (a × b) × (a × c) = (abc)(a · d).

6. Show that (a × b) · (b × c) × (c × a) = (abc)².

7. Show that a × (b × c) + b × (c × a) + c × (a × b) = 0.

8. Find an expression for the shortest distance from the end point of the vector r_1
to the plane passing through the end points of the vectors r_2, r_3, r_4. All four vectors
have a common origin O.

9. Assume (abc) ≠ 0. Let d = xa + yb + zc. Show that

$$x = \frac{(\mathbf{dbc})}{(\mathbf{abc})} \qquad y = \frac{(\mathbf{adc})}{(\mathbf{abc})} \qquad z = \frac{(\mathbf{abd})}{(\mathbf{abc})}$$

10. If (abc) ≠ 0, show that

$$\mathbf{d} = \frac{\mathbf{c}\cdot\mathbf{d}}{(\mathbf{abc})}\,\mathbf{a}\times\mathbf{b} + \frac{\mathbf{a}\cdot\mathbf{d}}{(\mathbf{abc})}\,\mathbf{b}\times\mathbf{c} + \frac{\mathbf{b}\cdot\mathbf{d}}{(\mathbf{abc})}\,\mathbf{c}\times\mathbf{a}$$

11. Consider the system of equations

a_1x + b_1y + c_1z = d_1
a_2x + b_2y + c_2z = d_2 (2.25)
a_3x + b_3y + c_3z = d_3

Let a = a_1 i + a_2 j + a_3 k, etc., and show that (2.25) can be written as

ax + by + cz = d

Show that $x = \frac{(\mathbf{dbc})}{(\mathbf{abc})}$, etc., (abc) ≠ 0.





2.10. Differentiation of Vectors. Let us consider the vector field 
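u = A_1(x, y, z, t)i + A_2(x, y, z, t)j + A_3(x, y, z, t)k (2.26)

At any point P(x, y, z) and at any time t, (2.26) defines a vector. If we
keep P fixed, the vector u can still change because of the time dependence
of its components A_1, A_2, A_3. If we keep the time fixed, we note that
the vector at P(x, y, z) will, in general, differ from the vector at Q(x + dx,
y + dy, z + dz). Now, in the calculus, the student has learned how to
find the change (or differential) of a single function of x, y, z, t. What
difficulties do we encounter in the case of a vector? Actually none, since
we easily note that u will change (in magnitude and/or direction) if and
only if the components of u change. The vectors i, j, k are assumed fixed
in space throughout the discussion. We are thus led to the following
definition for the differential of a vector:

$$d\mathbf{u} = dA_1\,\mathbf{i} + dA_2\,\mathbf{j} + dA_3\,\mathbf{k} \qquad (2.27)$$

where

$$dA_i = \frac{\partial A_i}{\partial x}dx + \frac{\partial A_i}{\partial y}dy + \frac{\partial A_i}{\partial z}dz + \frac{\partial A_i}{\partial t}dt \qquad i = 1, 2, 3$$

If x, y, z are functions of t, then

$$\frac{d\mathbf{u}}{dt} = \frac{dA_1}{dt}\mathbf{i} + \frac{dA_2}{dt}\mathbf{j} + \frac{dA_3}{dt}\mathbf{k} \qquad (2.28)$$

where

$$\frac{dA_i}{dt} = \frac{\partial A_i}{\partial x}\frac{dx}{dt} + \frac{\partial A_i}{\partial y}\frac{dy}{dt} + \frac{\partial A_i}{\partial z}\frac{dz}{dt} + \frac{\partial A_i}{\partial t} \qquad i = 1, 2, 3$$

In particular, let r = xi + yj + zk be the position vector of a moving
particle P(x, y, z). Then

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} + \frac{dz}{dt}\mathbf{k} \qquad (2.30)$$

and

$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d^2\mathbf{r}}{dt^2} = \frac{d^2x}{dt^2}\mathbf{i} + \frac{d^2y}{dt^2}\mathbf{j} + \frac{d^2z}{dt^2}\mathbf{k} \qquad (2.31)$$

Equations (2.30) and (2.31) are, by definition, the velocity and acceleration
of the particle.

If a vector u depends on a single variable t, we can define

$$\frac{d\mathbf{u}}{dt} = \lim_{\Delta t\to 0}\frac{\mathbf{u}(t + \Delta t) - \mathbf{u}(t)}{\Delta t} \qquad (2.32)$$

(see Fig. 2.18).

[Fig. 2.18: the vectors u(t), u(t + Δt), and Δu = u(t + Δt) − u(t)]

It is easy to verify that (2.32) is equivalent to (2.28). If

u = u(x, y, z, . . .)

then

$$\frac{\partial\mathbf{u}}{\partial x} = \lim_{\Delta x\to 0}\frac{\mathbf{u}(x + \Delta x, y, z, \ldots) - \mathbf{u}(x, y, z, \ldots)}{\Delta x} = \frac{\partial A_1}{\partial x}\mathbf{i} + \frac{\partial A_2}{\partial x}\mathbf{j} + \frac{\partial A_3}{\partial x}\mathbf{k}$$
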
2.11. Differentiation Rules. Consider 


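$$\varphi(t) = \mathbf{u}(t)\cdot\mathbf{v}(t)$$
$$\varphi(t + \Delta t) - \varphi(t) = \mathbf{u}(t + \Delta t)\cdot\mathbf{v}(t + \Delta t) - \mathbf{u}(t)\cdot\mathbf{v}(t)$$
$$= (\mathbf{u}(t) + \Delta\mathbf{u})\cdot(\mathbf{v}(t) + \Delta\mathbf{v}) - \mathbf{u}(t)\cdot\mathbf{v}(t)$$
$$= \mathbf{u}\cdot\Delta\mathbf{v} + \mathbf{v}\cdot\Delta\mathbf{u} + \Delta\mathbf{u}\cdot\Delta\mathbf{v}$$

Hence

$$\frac{\varphi(t + \Delta t) - \varphi(t)}{\Delta t} = \mathbf{u}\cdot\frac{\Delta\mathbf{v}}{\Delta t} + \mathbf{v}\cdot\frac{\Delta\mathbf{u}}{\Delta t} + \Delta\mathbf{u}\cdot\frac{\Delta\mathbf{v}}{\Delta t}$$

$$\frac{d\varphi}{dt} = \lim_{\Delta t\to 0}\frac{\varphi(t + \Delta t) - \varphi(t)}{\Delta t} = \mathbf{u}\cdot\frac{d\mathbf{v}}{dt} + \frac{d\mathbf{u}}{dt}\cdot\mathbf{v}$$

or

$$\frac{d}{dt}(\mathbf{u}\cdot\mathbf{v}) = \mathbf{u}\cdot\frac{d\mathbf{v}}{dt} + \frac{d\mathbf{u}}{dt}\cdot\mathbf{v} \qquad (2.34)$$

Equation (2.34) also can be easily obtained by writing u and v in component
form. Thus

$$\varphi = \sum_{i=1}^{3}u_iv_i \qquad \frac{d\varphi}{dt} = \sum_{i=1}^{3}u_i\frac{dv_i}{dt} + \sum_{i=1}^{3}v_i\frac{du_i}{dt} = \mathbf{u}\cdot\frac{d\mathbf{v}}{dt} + \frac{d\mathbf{u}}{dt}\cdot\mathbf{v}$$

Similarly

$$\frac{d}{dt}(\mathbf{u}\times\mathbf{v}) = \mathbf{u}\times\frac{d\mathbf{v}}{dt} + \frac{d\mathbf{u}}{dt}\times\mathbf{v} \qquad (2.35)$$

$$\frac{d}{dt}(f\mathbf{u}) = f\frac{d\mathbf{u}}{dt} + \frac{df}{dt}\mathbf{u} \qquad (2.36)$$

Example 2.12. Let $\mathbf{u}$ be a vector of magnitude u. Then $\mathbf{u}\cdot\mathbf{u} = u^2$ so that

$$\mathbf{u}\cdot\frac{d\mathbf{u}}{dt} = u\frac{du}{dt} \qquad (2.37)$$

This is a very useful result. In particular, if the magnitude of $\mathbf{u}$ remains constant,
$\frac{du}{dt} = 0$ and $\mathbf{u}\cdot\frac{d\mathbf{u}}{dt} = 0$. This implies, in general, that $\frac{d\mathbf{u}}{dt}$ is perpendicular to $\mathbf{u}$, if
$|\mathbf{u}|$ = constant and $\frac{d\mathbf{u}}{dt} \neq 0$. The reader should give a geometric proof of this
statement.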
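
The product rule (2.34) and the perpendicularity result of Example 2.12 can be observed numerically by replacing the limit in (2.32) with a small central difference. The sketch below is an approximation under that assumption; the sample functions u(t), v(t), the step size, and the helper names are illustrative choices.

```python
import math


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def derivative(f, t, h=1e-5):
    """Central-difference estimate of du/dt for a vector function f(t)."""
    forward, backward = f(t + h), f(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(forward, backward))


def u(t):
    # Constant magnitude: |u| = sqrt(9 + 16) = 5 for every t.
    return (3 * math.cos(t), 3 * math.sin(t), 4.0)


def v(t):
    return (t, t * t, 1.0)


t0, h = 0.7, 1e-5
du, dv = derivative(u, t0), derivative(v, t0)

# Left side of (2.34): derivative of the scalar function u . v
lhs = (dot(u(t0 + h), v(t0 + h)) - dot(u(t0 - h), v(t0 - h))) / (2 * h)
# Right side of (2.34): u . dv/dt + du/dt . v
rhs = dot(u(t0), dv) + dot(du, v(t0))

# Example 2.12: a vector of constant magnitude is perpendicular to its derivative.
perpendicularity = dot(u(t0), du)

print(lhs, rhs, perpendicularity)
```

The two sides of (2.34) agree to within the finite-difference error, and u · du/dt is zero to the same accuracy.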
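Example 2.13. Motion in a Plane. Let $\mathbf{r}$ be the position vector of a point P
moving in a plane having polar coordinates (r, θ) (see Fig. 2.19). Now $\mathbf{r} = r\mathbf{R}$, $\mathbf{R}$ a
unit vector. Since $\mathbf{R} = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$, we have $\mathbf{P} = \frac{d\mathbf{R}}{d\theta} = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j}$.
Why is $\mathbf{P}$ perpendicular to $\mathbf{R}$? Notice that $\mathbf{P}$ is also a unit vector. Differentiating
$\mathbf{r} = r\mathbf{R}$ yields

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{dr}{dt}\mathbf{R} + r\frac{d\mathbf{R}}{d\theta}\frac{d\theta}{dt} = \frac{dr}{dt}\mathbf{R} + r\frac{d\theta}{dt}\mathbf{P}$$

$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d^2r}{dt^2}\mathbf{R} + \frac{dr}{dt}\frac{d\mathbf{R}}{d\theta}\frac{d\theta}{dt} + \frac{dr}{dt}\frac{d\theta}{dt}\mathbf{P} + r\frac{d^2\theta}{dt^2}\mathbf{P} + r\frac{d\theta}{dt}\frac{d\mathbf{P}}{d\theta}\frac{d\theta}{dt}$$

so that the acceleration of the particle is

$$\mathbf{a} = \left[\frac{d^2r}{dt^2} - r\left(\frac{d\theta}{dt}\right)^2\right]\mathbf{R} + \frac{1}{r}\frac{d}{dt}\left(r^2\frac{d\theta}{dt}\right)\mathbf{P} \qquad (2.38)$$

since $\frac{d\mathbf{P}}{d\theta} = -\cos\theta\,\mathbf{i} - \sin\theta\,\mathbf{j} = -\mathbf{R}$. If the particle moves under the action of a
central force field, $\mathbf{f} = f\mathbf{R}$, then $\frac{1}{r}\frac{d}{dt}\left(r^2\frac{d\theta}{dt}\right) = 0$ so that $\frac{1}{2}r^2\frac{d\theta}{dt} = h$ = constant. The
sectoral area swept out by the particle in time dt is $dA = \frac{1}{2}r^2\,d\theta$, so that $\frac{dA}{dt} = h$, and
equal areas are swept out in equal periods of time. This is Kepler's second law of
planetary motion (the law of areas).

[Fig. 2.19]    [Fig. 2.20]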


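Example 2.14. Frenet-Serret Formulas. A three-dimensional curve, Γ, in a
Euclidean space can be represented by the locus of the end point of the position
vector given by

$$\mathbf{r}(t) = x(t)\mathbf{i} + y(t)\mathbf{j} + z(t)\mathbf{k} \qquad (2.39)$$

where t is a parameter ranging over a set of values $t_0 \le t \le t_1$. If s is arc length along
the space curve, then $\left(\frac{dx}{ds}\right)^2 + \left(\frac{dy}{ds}\right)^2 + \left(\frac{dz}{ds}\right)^2 = 1$ from the calculus. It is natural
to define $\frac{d\mathbf{r}}{ds}$ as the unit tangent vector to the space curve Γ (see Fig. 2.20). Since
$\mathbf{t} = \frac{d\mathbf{r}}{ds}$ is a unit vector, $\frac{d\mathbf{t}}{ds}$ is perpendicular to $\mathbf{t}$. Moreover, $\frac{d\mathbf{t}}{ds}$ tells us how fast the
direction of $\mathbf{t}$ is changing with respect to arc length s. Hence we define the curvature,
κ, of the space curve Γ by $\kappa^2 = \frac{d\mathbf{t}}{ds}\cdot\frac{d\mathbf{t}}{ds}$. κ, in general, is a function of s, and hence κ
varies from point to point. The principal normal vector to the space curve Γ is
defined to be the unit vector, $\mathbf{n}$, parallel to $\frac{d\mathbf{t}}{ds}$. Thus

$$\frac{d\mathbf{t}}{ds} = \kappa\mathbf{n} \qquad (2.40)$$

The reciprocal of the curvature is called the radius of curvature, ρ = 1/κ. At any
point P on Γ we now have two vectors $\mathbf{t}$ and $\mathbf{n}$ at right angles to each other. This
enables us to set up a local coordinate system at P by defining a third vector at right
angles to $\mathbf{t}$ and $\mathbf{n}$. We define as the binormal the vector $\mathbf{b} = \mathbf{t}\times\mathbf{n}$. The three
fundamental vectors $\mathbf{t}$, $\mathbf{n}$, $\mathbf{b}$ form a trihedral at P; any vector associated with the
space curve Γ can be written as a linear combination of $\mathbf{t}$, $\mathbf{n}$, $\mathbf{b}$.

Let us now evaluate $\frac{d\mathbf{b}}{ds}$ and $\frac{d\mathbf{n}}{ds}$. From $\mathbf{b}\cdot\mathbf{t} = 0$ we obtain $\frac{d\mathbf{b}}{ds}\cdot\mathbf{t} + \mathbf{b}\cdot\frac{d\mathbf{t}}{ds} = 0$ or
$\frac{d\mathbf{b}}{ds}\cdot\mathbf{t} = 0$ since $\mathbf{b}\cdot\mathbf{n} = 0$. Hence $\frac{d\mathbf{b}}{ds}$ is perpendicular to $\mathbf{t}$. Since $\frac{d\mathbf{b}}{ds}$ is also
perpendicular to $\mathbf{b}$ ($\mathbf{b}$ is a unit vector), we see that $\frac{d\mathbf{b}}{ds}$ must be parallel to $\mathbf{n}$. Consequently
$\frac{d\mathbf{b}}{ds} = \tau\mathbf{n}$, where τ by definition is the magnitude of $\frac{d\mathbf{b}}{ds}$. τ is called the torsion of the
curve Γ. To obtain $\frac{d\mathbf{n}}{ds}$, we note that $\mathbf{n} = \mathbf{b}\times\mathbf{t}$ so that

$$\frac{d\mathbf{n}}{ds} = \mathbf{b}\times\frac{d\mathbf{t}}{ds} + \frac{d\mathbf{b}}{ds}\times\mathbf{t} = \mathbf{b}\times\kappa\mathbf{n} + \tau\mathbf{n}\times\mathbf{t} = -\kappa\mathbf{t} - \tau\mathbf{b}$$

The famous Frenet-Serret formulas are

$$\frac{d\mathbf{t}}{ds} = \kappa\mathbf{n} \qquad \frac{d\mathbf{n}}{ds} = -(\kappa\mathbf{t} + \tau\mathbf{b}) \qquad \frac{d\mathbf{b}}{ds} = \tau\mathbf{n} \qquad (2.41)$$
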
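As an example of (2.41) we consider the circular helix given by

$$\mathbf{r} = a\cos t\,\mathbf{i} + a\sin t\,\mathbf{j} + bt\,\mathbf{k}$$

We have $\mathbf{t} = \frac{d\mathbf{r}}{ds} = (-a\sin t\,\mathbf{i} + a\cos t\,\mathbf{j} + b\,\mathbf{k})\frac{dt}{ds}$. From $\mathbf{t}\cdot\mathbf{t} = 1$ we obtain

$$1 = (a^2 + b^2)\left(\frac{dt}{ds}\right)^2$$

so that

$$\mathbf{t} = (-a\sin t\,\mathbf{i} + a\cos t\,\mathbf{j} + b\,\mathbf{k})(a^2 + b^2)^{-1/2}$$

Thus $\kappa\mathbf{n} = \frac{d\mathbf{t}}{ds} = (-a\cos t\,\mathbf{i} - a\sin t\,\mathbf{j})(a^2 + b^2)^{-1}$, and $\kappa = a(a^2 + b^2)^{-1}$, since $\kappa = \left|\frac{d\mathbf{t}}{ds}\right|$.

From $\mathbf{b} = \mathbf{t}\times\mathbf{n}$ let the reader show that $\mathbf{b} = (b\sin t\,\mathbf{i} - b\cos t\,\mathbf{j} + a\,\mathbf{k})(a^2 + b^2)^{-1/2}$,
$\frac{d\mathbf{b}}{ds} = \tau\mathbf{n} = (b\cos t\,\mathbf{i} + b\sin t\,\mathbf{j})(a^2 + b^2)^{-1}$, $\tau = b(a^2 + b^2)^{-1}$.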
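
The helix results can be cross-checked numerically. The sketch below differentiates the unit tangent and the binormal with respect to arc length by central differences, using ds/dt = (a² + b²)^{1/2}, and compares the magnitudes |dt/ds| and |db/ds| with a(a² + b²)^{-1} and b(a² + b²)^{-1}. The sample values of a and b, the step size, and the helper names are assumptions; only magnitudes are compared, so no sign convention for the torsion is implied.

```python
import math

A, B = 2.0, 1.0        # sample helix constants a and b (hypothetical values)
C = math.hypot(A, B)   # ds/dt = sqrt(a^2 + b^2)


def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


def norm(u):
    return math.sqrt(sum(x * x for x in u))


def d_ds(f, t, h=1e-4):
    """Arc-length derivative of a vector function of the parameter t."""
    return tuple((p - q) / (2 * h * C) for p, q in zip(f(t + h), f(t - h)))


def tangent(t):
    """Unit tangent of the helix r = a cos t i + a sin t j + b t k."""
    return (-A * math.sin(t) / C, A * math.cos(t) / C, B / C)


def normal(t):
    """Principal normal: unit vector parallel to dt/ds."""
    dt_ds = d_ds(tangent, t)
    size = norm(dt_ds)
    return tuple(x / size for x in dt_ds)


def binormal(t):
    return cross(tangent(t), normal(t))


t0 = 0.4
kappa = norm(d_ds(tangent, t0))          # curvature |dt/ds|
torsion_size = norm(d_ds(binormal, t0))  # magnitude |db/ds|

print(kappa, A / C**2)
print(torsion_size, B / C**2)
```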

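Example 2.15. Pursuit Problems. Let us consider the problem of a missile M
pursuing a target T, the motion taking place in the xy plane. $\mathbf{r}_T$ and $\mathbf{V}_T$ are the
position and velocity vectors of the target; $\mathbf{r}_M$ and $\mathbf{V}_M$ are the position and velocity
vectors of the missile; θ, φ, ψ are defined by Fig. 2.21.

[Fig. 2.21]

Let $\mathbf{r} = \mathbf{r}_T - \mathbf{r}_M$ so that $\frac{d\mathbf{r}}{dt} = \mathbf{V}_T - \mathbf{V}_M$ and $\mathbf{r}\cdot\frac{d\mathbf{r}}{dt} = \mathbf{r}\cdot\mathbf{V}_T - \mathbf{r}\cdot\mathbf{V}_M$. Thus
$r\frac{dr}{dt} = rV_T\cos\varphi - rV_M\cos\theta$, and

$$\frac{dr}{dt} = V_T\cos\varphi - V_M\cos\theta \qquad (2.42)$$

Differentiating the identity $\mathbf{r}\cdot\mathbf{V}_T = rV_T\cos\varphi$ yields

$$\mathbf{r}\cdot\frac{d\mathbf{V}_T}{dt} + (\mathbf{V}_T - \mathbf{V}_M)\cdot\mathbf{V}_T = V_T\frac{d}{dt}(r\cos\varphi) + r\cos\varphi\,\frac{dV_T}{dt} \qquad (2.43)$$

If $\mathbf{t}$ is the unit tangent vector to the curve Γ traversed by the target, then $\mathbf{V}_T = V_T\mathbf{t}$
and $\frac{d\mathbf{V}_T}{dt} = \frac{dV_T}{dt}\mathbf{t} + V_T\frac{d\mathbf{t}}{dt}$ so that $\frac{d\mathbf{V}_T}{dt} = \frac{dV_T}{dt}\mathbf{t} + V_T\frac{d\psi}{dt}\mathbf{n}$ since $\frac{d\mathbf{t}}{dt} = \frac{d\psi}{dt}\mathbf{n}$ (see
Example 2.14). Equation (2.43) becomes

$$r\frac{dV_T}{dt}\cos\varphi - rV_T\frac{d\psi}{dt}\sin\varphi + V_T^2 - V_MV_T\cos(\theta - \varphi) = V_T\frac{d}{dt}(r\cos\varphi) + r\cos\varphi\,\frac{dV_T}{dt} \qquad (2.44)$$

For the special case of constant target speed, $\frac{dV_T}{dt} = 0$, (2.44) becomes

$$-rV_T\frac{d\psi}{dt}\sin\varphi + V_T^2 - V_MV_T\cos(\varphi - \theta) = V_T\frac{d}{dt}(r\cos\varphi) \qquad (2.45)$$

Let us apply (2.42), (2.45) to the dog-rabbit problem. At t = 0 the rabbit starts at
the origin and runs along the positive y axis with constant speed $V_T$. A dog starts at
(a, 0) at t = 0 and pursues the rabbit in such a manner (direct pursuit) that θ = 0
throughout the motion. The constant speed of the dog is $V_M$. We have

$$\frac{dr}{dt} = V_T\cos\varphi - V_M$$
$$V_T^2 - V_MV_T\cos\varphi = V_T\frac{d}{dt}(r\cos\varphi) \qquad (2.46)$$

since ψ = π/2. Equations (2.46) can be written as

$$V_T\frac{d}{dt}(r\cos\varphi) + V_M\frac{dr}{dt} = V_T^2 - V_M^2$$

Integration yields $V_Tr\cos\varphi + V_Mr = (V_T^2 - V_M^2)t + V_Ma$, since r = a, φ = −π/2,
at t = 0. The rabbit is caught when r = 0, or at $t = V_Ma(V_M^2 - V_T^2)^{-1}$.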
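
The capture time of the dog-rabbit problem can be compared with a direct step-by-step simulation of the pursuit. The sketch below is a crude explicit time-stepping scheme, not the method of the text: at each step the dog moves a distance V_M dt straight toward the rabbit's current position. The function name, the step size, and the sample speeds are assumptions chosen for illustration.

```python
import math


def capture_time(a, v_target, v_missile, dt=1e-4):
    """Simulate direct pursuit and return the approximate time of capture.

    The rabbit starts at the origin and runs along the positive y axis;
    the dog starts at (a, 0) and always heads toward the rabbit.
    """
    if v_missile <= v_target:
        raise ValueError("the pursuer must be faster than the target")
    dog_x, dog_y = a, 0.0
    t = 0.0
    while True:
        rabbit_y = v_target * t
        dx, dy = -dog_x, rabbit_y - dog_y
        dist = math.hypot(dx, dy)
        if dist <= v_missile * dt:  # the dog reaches the rabbit within one step
            return t
        dog_x += v_missile * dt * dx / dist
        dog_y += v_missile * dt * dy / dist
        t += dt


a, v_target, v_missile = 1.0, 1.0, 2.0
simulated = capture_time(a, v_target, v_missile)
predicted = v_missile * a / (v_missile**2 - v_target**2)
print(simulated, predicted)
```

With these sample values the closed-form result is 2/3, and the simulated time should agree to within a few time steps.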




Problems 
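1. Prove (2.35), (2.36).

2. Show that $\frac{d}{dt}\left(\mathbf{r}\times\frac{d\mathbf{r}}{dt}\right) = \mathbf{r}\times\frac{d^2\mathbf{r}}{dt^2}$.

3. $\mathbf{r} = \mathbf{a}\cos\omega t + \mathbf{b}\sin\omega t$; $\mathbf{a}$, $\mathbf{b}$, ω are constants. Prove that $\mathbf{r}\times\frac{d\mathbf{r}}{dt} = \omega\,\mathbf{a}\times\mathbf{b}$ and
$\frac{d^2\mathbf{r}}{dt^2} + \omega^2\mathbf{r} = 0$.

4. $\mathbf{R}$ is a unit vector in the direction $\mathbf{r}$, $r = |\mathbf{r}|$. Show that $\mathbf{R}\times d\mathbf{R} = \frac{\mathbf{r}\times d\mathbf{r}}{r^2}$.

5. If $\frac{d\mathbf{a}}{dt} = \boldsymbol{\omega}\times\mathbf{a}$, $\frac{d\mathbf{b}}{dt} = \boldsymbol{\omega}\times\mathbf{b}$, show that $\frac{d}{dt}(\mathbf{a}\times\mathbf{b}) = \boldsymbol{\omega}\times(\mathbf{a}\times\mathbf{b})$.

6. If $\mathbf{r} = \mathbf{a}e^{\omega t} + \mathbf{b}e^{-\omega t}$, $\mathbf{a}$, $\mathbf{b}$, ω constants, show that $\frac{d^2\mathbf{r}}{dt^2} - \omega^2\mathbf{r} = 0$.

7. Consider the differential equation

$$\frac{d^2\mathbf{u}}{dt^2} + 2A\frac{d\mathbf{u}}{dt} + B\mathbf{u} = 0 \qquad (2.47)$$

A, B constants. Assume a solution of the form $\mathbf{u}(t) = \mathbf{C}e^{\omega t}$, $\mathbf{C}$ a constant vector, ω a
constant scalar. Show that $\mathbf{u}(t) = \mathbf{C}_1e^{\omega_1t} + \mathbf{C}_2e^{\omega_2t}$ is a solution of (2.47) where $\omega_1$, $\omega_2$
are roots of $\omega^2 + 2A\omega + B = 0$. What if $\omega_1 = \omega_2$?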

dus d*u du du 
8. Find a vector u which satisfies —; 78 a2 7 2— toe such that u = i, ae i, 


d?y 
Sa = kfort = 0. 


9. For the space curve x = 3t − t³, y = 3t², z = 3t + t³ show that

$$\kappa = \tau = \tfrac{1}{3}(1 + t^2)^{-2}$$

10. Show that $\frac{d^3\mathbf{r}}{ds^3} = -\kappa^2\mathbf{t} + \frac{d\kappa}{ds}\mathbf{n} - \kappa\tau\mathbf{b}$.

11. Four particles on the corners of a square (sides = b) begin to move toward each
other in a clockwise fashion in a direct-pursuit course. Each has constant speed V.
Show that they move a distance b before contact takes place.

12. In navigational pursuit $V_M\sin\theta = V_T\sin\varphi$. Interpret this result
geometrically, assuming θ, φ constants.

13. A target moves on the circumference of a circle with constant speed V. A
missile starts at the center of this circle and pursues the target. The speed of the
missile is also V. The pursuit is such that the center of the circle, the missile, and the
target are collinear. Show that the target moves one-fourth of the circumference up
to the moment of capture.


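2.12. The Gradient. Let φ(x, y, z) be any differentiable space function.
From the calculus

$$d\varphi = \frac{\partial\varphi}{\partial x}dx + \frac{\partial\varphi}{\partial y}dy + \frac{\partial\varphi}{\partial z}dz \qquad (2.48)$$

The right-hand side of (2.48) suggests that the scalar product of two
vectors might be involved. If r = xi + yj + zk is the position vector of
the point P(x, y, z), then dr = dx i + dy j + dz k. Hence, to express dφ
as a scalar product, one need only define the vector with components
$\frac{\partial\varphi}{\partial x}$, $\frac{\partial\varphi}{\partial y}$, $\frac{\partial\varphi}{\partial z}$. This vector is called the gradient of φ(x, y, z), written
del φ = ∇φ. We define

$$\nabla\varphi = \frac{\partial\varphi}{\partial x}\mathbf{i} + \frac{\partial\varphi}{\partial y}\mathbf{j} + \frac{\partial\varphi}{\partial z}\mathbf{k} \qquad (2.49)$$

so that

$$d\varphi = d\mathbf{r}\cdot\nabla\varphi \qquad (2.50)$$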


The reader should recall from the calculus that $d\varphi$ represents the change in $\varphi$ as we move from $P(x, y, z)$ to $Q(x + dx, y + dy, z + dz)$, except for infinitesimals of higher order. Equation (2.50) states that this change in $\varphi$ can be obtained by evaluating the gradient of $\varphi$ at $P$ and computing the scalar product of $\nabla\varphi$ and $d\mathbf{r}$, $d\mathbf{r}$ being the vector from $P$ to $Q$.

We now give a geometrical interpretation of $\nabla\varphi$. From $\varphi(x, y, z)$ we can form a family of surfaces $\varphi(x, y, z) = \text{constant}$. The surface $S$ given by $\varphi(x, y, z) = \varphi(x_0, y_0, z_0)$ contains the point $P(x_0, y_0, z_0)$. $\varphi(x, y, z)$ has the constant value $\varphi(x_0, y_0, z_0)$ if we remain on this particular surface $S$. Now let $Q$ be any point on $S$ near $P$. Since $d\varphi = 0$, we have, from (2.50), $d\mathbf{r} \cdot \nabla\varphi = 0$. Hence $\nabla\varphi$ is perpendicular to $d\mathbf{r}$. Thus $\nabla\varphi$, at $P$, is normal to all possible tangents to the surface at $P$ so that $\nabla\varphi$ necessarily must be normal to the surface $\varphi(x, y, z) = \varphi(x_0, y_0, z_0)$ at $P(x_0, y_0, z_0)$. This is a highly important result and should be thoroughly understood by the reader.

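Let us now return to (2.50). If $ds = |d\mathbf{r}|$, (2.50) states that

$$\frac{d\varphi}{ds} = \frac{d\mathbf{r}}{ds} \cdot \nabla\varphi = \mathbf{u} \cdot \nabla\varphi \tag{2.50'}$$

where $|\mathbf{u}| = 1$. Hence $\dfrac{d\varphi}{ds} = |\nabla\varphi| \cos\theta$, where $\theta$ is the angle between $\mathbf{u}$ and $\nabla\varphi$. It is obvious that $\dfrac{d\varphi}{ds}$ has its maximum when $\theta = 0$. The greatest change in $\varphi(x, y, z)$ at $P(x_0, y_0, z_0)$ occurs in the direction of $\nabla\varphi$, that is, the greatest change in $\varphi$ occurs when we move normal to the surface $\varphi(x, y, z) = \varphi(x_0, y_0, z_0)$. This is to be expected. Let the reader show that $\nabla(\varphi_1 + \varphi_2) = \nabla\varphi_1 + \nabla\varphi_2$.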
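Example 2.16. Let us find a unit vector perpendicular to the surface

$$x^2 - xy + yz = 1$$

at the point $P(1, 1, 1)$. Here $\varphi(x, y, z) = x^2 - xy + yz$, and

$$\nabla\varphi = (2x - y)\mathbf{i} + (z - x)\mathbf{j} + y\mathbf{k} = \mathbf{i} + \mathbf{k} \quad \text{at } P(1, 1, 1)$$

$$\mathbf{N} = \frac{\nabla\varphi}{|\nabla\varphi|} = \frac{\mathbf{i} + \mathbf{k}}{\sqrt{2}}$$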
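The unit normal of Example 2.16 can be checked numerically. What follows is a minimal sketch, assuming a central-difference approximation to the partial derivatives in (2.49); the helper names `grad` and `unit_normal` and the step size `h` are illustrative choices, not anything from the text.

```python
import math

def grad(phi, p, h=1e-6):
    """Central-difference estimate of grad(phi) at the point p = (x, y, z)."""
    g = []
    for k in range(3):
        fwd, bwd = list(p), list(p)
        fwd[k] += h
        bwd[k] -= h
        g.append((phi(*fwd) - phi(*bwd)) / (2 * h))
    return g

def unit_normal(phi, p):
    """Unit vector along grad(phi), i.e. normal to the level surface through p."""
    g = grad(phi, p)
    length = math.sqrt(sum(c * c for c in g))
    return [c / length for c in g]

def phi(x, y, z):
    # The scalar function of Example 2.16.
    return x**2 - x*y + y*z

N = unit_normal(phi, (1.0, 1.0, 1.0))
print(N)  # components close to (1/sqrt(2), 0, 1/sqrt(2))
```

The same two helpers apply to any differentiable $\varphi$; only the function `phi` and the point need to change.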


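Example 2.17. By direct computation $\nabla r = \mathbf{r}/r$ for $r = (x^2 + y^2 + z^2)^{1/2}$. We obtain this result by a different method. The surface $r = \text{constant}$ is a sphere with center at the origin. Since $\nabla r$ is perpendicular to the sphere, $\nabla r = f\mathbf{r}$. Now

$$dr = \nabla r \cdot d\mathbf{r} = f\,\mathbf{r} \cdot d\mathbf{r} = f r\,dr$$

so that $f = 1/r$. Q.E.D.

Example 2.18. Consider $\nabla f(u)$, $u = u(x, y, z)$. We have

$$\nabla f(u) = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} = f'(u)\frac{\partial u}{\partial x}\mathbf{i} + f'(u)\frac{\partial u}{\partial y}\mathbf{j} + f'(u)\frac{\partial u}{\partial z}\mathbf{k}$$

where $f'(u) = \dfrac{df}{du}$. Hence $\nabla f(u) = f'(u)\nabla u$.

Example 2.19. The operator del, $\nabla = \mathbf{i}\dfrac{\partial}{\partial x} + \mathbf{j}\dfrac{\partial}{\partial y} + \mathbf{k}\dfrac{\partial}{\partial z}$, is a useful concept. It is helpful to keep in mind that $\nabla$ acts both as a differential operator and as a vector in some sense. Thus

$$\nabla\varphi = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\varphi = \mathbf{i}\frac{\partial\varphi}{\partial x} + \mathbf{j}\frac{\partial\varphi}{\partial y} + \mathbf{k}\frac{\partial\varphi}{\partial z}$$

Let the reader show that $\nabla(C_1\varphi_1 + C_2\varphi_2) = C_1\nabla\varphi_1 + C_2\nabla\varphi_2$ if $C_1$, $C_2$ are constants. It is easy to show that

$$\nabla(\varphi_1\varphi_2) = \varphi_1\nabla\varphi_2 + \varphi_2\nabla\varphi_1 \tag{2.51}$$

Notice how (2.51) conforms to the rule of calculus for the derivative of a product.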


Problems 


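1. Find the equation of the tangent plane to the surface $xy - z = 1$ at the point $(2, 1, 1)$.

2. Show that $\nabla(\mathbf{a} \cdot \mathbf{r}) = \mathbf{a}$, where $\mathbf{a}$ is a constant vector.

3. If $r = (x^2 + y^2 + z^2)^{1/2}$, show that $\nabla r^n = n r^{n-2}\mathbf{r}$.

4. If $\varphi = (\mathbf{r} \times \mathbf{a}) \cdot (\mathbf{r} \times \mathbf{b})$, show that $\nabla\varphi = \mathbf{b} \times (\mathbf{r} \times \mathbf{a}) + \mathbf{a} \times (\mathbf{r} \times \mathbf{b})$ when $\mathbf{a}$ and $\mathbf{b}$ are constant vectors.

5. Show that the ellipse $r_1 + r_2 = c_1$ and the hyperbola $r_1 - r_2 = c_2$ intersect at right angles when they have the same foci.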

6. Find the change of ¢ = x®y + yz? — zz in the direction normal to the surface 
yx? + zy? + zy = 3 at the point P(1, 1, 1). 

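7. Prove (2.51).

8. If $f = f(u_1, u_2, \ldots, u_n)$, $u_k = u_k(x, y, z)$, $k = 1, 2, \ldots, n$, show that

$$\nabla f = \sum_{k=1}^{n} \frac{\partial f}{\partial u_k}\nabla u_k$$

9. If $\varphi = \varphi(x, y, z, t)$, show that $\dfrac{d\varphi}{dt} = \dfrac{\partial\varphi}{\partial t} + \dfrac{d\mathbf{r}}{dt} \cdot \nabla\varphi$.

10. The equation of an ellipse is $r_1 + r_2 = \text{constant}$. Why is $\nabla(r_1 + r_2) \cdot \mathbf{T} = 0$ if $\mathbf{T}$ is a unit tangent to the ellipse at the point $P$? $\nabla(r_1 + r_2)$ is computed at $P$. From $\nabla(r_1 + r_2) = \nabla r_1 + \nabla r_2$, $\nabla r_1 = \dfrac{\mathbf{r}_1}{r_1}$, $\nabla r_2 = \dfrac{\mathbf{r}_2}{r_2}$, give a geometric interpretation to

$$\nabla(r_1 + r_2) \cdot \mathbf{T} = 0$$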


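2.13. The Divergence of a Vector. Let us consider the motion of a fluid of density $\rho(x, y, z, t)$, the velocity of the fluid at any point being given as $\mathbf{V} = \mathbf{V}(x, y, z, t)$. Let

$$\mathbf{f} = \rho\mathbf{V} = X\mathbf{i} + Y\mathbf{j} + Z\mathbf{k}$$

We now concentrate on the flow of fluid through a small parallelepiped $ABCDEFGH$ (Fig. 2.22) of dimensions $dx$, $dy$, $dz$. At time $t$ let us calculate the amount of fluid entering the box through the face $ABCD$. The $x$ and $z$ components of the velocity contribute nothing to the flow through $ABCD$.

Fig. 2.22

Now $Y(x, y, z, t)$ has the dimension $M/L^2T$, $M$ = mass, $L$ = length, $T$ = time. Thus $Y\,dx\,dz$ has the dimension $M/T$ and denotes the gain of mass per unit time by the box because of flow through the face $ABCD$. Similarly $\left(Y + \dfrac{\partial Y}{\partial y}\,dy\right) dx\,dz$ represents the loss of mass per unit time because of flow through the face $EFGH$ at the same time $t$. The loss of mass per unit time is thus $\dfrac{\partial Y}{\partial y}\,dx\,dy\,dz$. If we also take into consideration the other faces of the box, we find that the total loss of mass per unit time is

$$\left(\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} + \frac{\partial Z}{\partial z}\right) dx\,dy\,dz \tag{2.52}$$

Hence $\dfrac{\partial X}{\partial x} + \dfrac{\partial Y}{\partial y} + \dfrac{\partial Z}{\partial z}$ is the loss of mass per unit time per unit volume. This scalar is called the divergence of the vector field $\mathbf{f}$, written

$$\operatorname{div} \mathbf{f} = \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} + \frac{\partial Z}{\partial z} \tag{2.53}$$

Returning to the operator $\nabla = \mathbf{i}\dfrac{\partial}{\partial x} + \mathbf{j}\dfrac{\partial}{\partial y} + \mathbf{k}\dfrac{\partial}{\partial z}$, we note that

$$\nabla \cdot \mathbf{f} = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right) \cdot (X\mathbf{i} + Y\mathbf{j} + Z\mathbf{k}) = \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} + \frac{\partial Z}{\partial z} \tag{2.54}$$

provided we interpret $\nabla$ as both a vector and a differential operator. Let the reader show that

$$\nabla \cdot (\varphi\mathbf{f}) = \varphi\nabla \cdot \mathbf{f} + \nabla\varphi \cdot \mathbf{f} \tag{2.55}$$

Example 2.20. For $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, $\nabla \cdot \mathbf{r} = 3$. For $\mathbf{f} = r^{-3}\mathbf{r}$,

$$\nabla \cdot \mathbf{f} = r^{-3}\nabla \cdot \mathbf{r} + \nabla r^{-3} \cdot \mathbf{r} = 3r^{-3} - 3r^{-4}\nabla r \cdot \mathbf{r} = 3r^{-3} - 3r^{-5}\,\mathbf{r} \cdot \mathbf{r} = 0$$

(see Example 2.17).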
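Both results of Example 2.20 can be confirmed numerically away from the origin. The sketch below assumes central differences for the three partial derivatives in (2.53); the function names and the sample point are arbitrary illustrative choices.

```python
def divergence(field, p, h=1e-5):
    """Central-difference estimate of dX/dx + dY/dy + dZ/dz at p = (x, y, z)."""
    total = 0.0
    for k in range(3):
        fwd, bwd = list(p), list(p)
        fwd[k] += h
        bwd[k] -= h
        total += (field(*fwd)[k] - field(*bwd)[k]) / (2 * h)
    return total

def position(x, y, z):
    # The field r = x i + y j + z k.
    return (x, y, z)

def inverse_square(x, y, z):
    # The field r^(-3) r, defined everywhere except the origin.
    r3 = (x*x + y*y + z*z) ** 1.5
    return (x / r3, y / r3, z / r3)

point = (0.7, -1.2, 0.4)
div_r = divergence(position, point)          # expected to be close to 3
div_inv = divergence(inverse_square, point)  # expected to be close to 0
print(div_r, div_inv)
```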
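Example 2.21. What is the divergence of a gradient?

$$\nabla \cdot (\nabla\varphi) = \nabla \cdot \left(\frac{\partial\varphi}{\partial x}\mathbf{i} + \frac{\partial\varphi}{\partial y}\mathbf{j} + \frac{\partial\varphi}{\partial z}\mathbf{k}\right) = \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2}$$

This important scalar is called the Laplacian of $\varphi(x, y, z)$.

$$\operatorname{Lap} \varphi = \nabla \cdot (\nabla\varphi) = \nabla^2\varphi = \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2} \tag{2.56}$$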


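2.14. The Curl of a Vector. Let $v_i$, $i = 1, 2, 3$, be the components of the velocity of a fluid in an $x^1x^2x^3$ Euclidean coordinate system. The differential change in the components of $\mathbf{v}$ is given by

$$dv_i = \frac{\partial v_i}{\partial x^j}\,dx^j + \frac{\partial v_i}{\partial t}\,dt = \frac{1}{2}\left(\frac{\partial v_i}{\partial x^j} + \frac{\partial v_j}{\partial x^i}\right)dx^j + \frac{1}{2}\left(\frac{\partial v_i}{\partial x^j} - \frac{\partial v_j}{\partial x^i}\right)dx^j + \frac{\partial v_i}{\partial t}\,dt \tag{2.57}$$

The terms $s_{ij} = \dfrac{\partial v_i}{\partial x^j} - \dfrac{\partial v_j}{\partial x^i}$, $i, j = 1, 2, 3$, now occupy our attention. The $s_{ij}$ are the elements of a skew-symmetric matrix. As a result there are three important elements, listed as

$$\begin{aligned}
\xi_1 &= \frac{\partial v_3}{\partial x^2} - \frac{\partial v_2}{\partial x^3} \\
\xi_2 &= \frac{\partial v_1}{\partial x^3} - \frac{\partial v_3}{\partial x^1} \\
\xi_3 &= \frac{\partial v_2}{\partial x^1} - \frac{\partial v_1}{\partial x^2}
\end{aligned} \tag{2.58}$$

The vector $\xi_1\mathbf{i} + \xi_2\mathbf{j} + \xi_3\mathbf{k}$ is defined and called the curl of $\mathbf{v}$. Using the $\nabla$ operator, we note that the curl of $\mathbf{v}$ may be written

$$\operatorname{curl} \mathbf{v} = \nabla \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ v_1 & v_2 & v_3 \end{vmatrix} \tag{2.59}$$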


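Example 2.22. Let $\mathbf{f} = x^2yz\,\mathbf{i} - 2xyz^2\,\mathbf{j} + y^2z\,\mathbf{k}$.

$$\nabla \times \mathbf{f} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ x^2yz & -2xyz^2 & y^2z \end{vmatrix} = (2yz + 4xyz)\mathbf{i} + x^2y\,\mathbf{j} - (2yz^2 + x^2z)\mathbf{k}$$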
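The determinant expansion of Example 2.22 is easy to get wrong by a sign, so a numerical cross-check is worthwhile. This is a minimal sketch, assuming central differences for the partial derivatives in (2.59); `curl`, `f`, and `curl_f_closed_form` are names introduced here for illustration only.

```python
def curl(field, p, h=1e-5):
    """Central-difference estimate of curl(field) at p = (x, y, z)."""
    def d(component, var):
        # Partial derivative of field[component] with respect to coordinate var.
        fwd, bwd = list(p), list(p)
        fwd[var] += h
        bwd[var] -= h
        return (field(*fwd)[component] - field(*bwd)[component]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def f(x, y, z):
    # The vector field of Example 2.22.
    return (x*x*y*z, -2*x*y*z*z, y*y*z)

def curl_f_closed_form(x, y, z):
    # The result of the determinant expansion in Example 2.22.
    return (2*y*z + 4*x*y*z, x*x*y, -(2*y*z*z + x*x*z))

point = (1.5, -0.5, 2.0)
numeric = curl(f, point)
exact = curl_f_closed_form(*point)
print(numeric, exact)
```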


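Example 2.23. Consider $\nabla \times (\varphi\mathbf{f})$. We have, for $\mathbf{f} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$,

$$\nabla \times (\varphi\mathbf{f}) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \varphi u & \varphi v & \varphi w \end{vmatrix} = \varphi \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ u & v & w \end{vmatrix} + \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial\varphi}{\partial x} & \dfrac{\partial\varphi}{\partial y} & \dfrac{\partial\varphi}{\partial z} \\ u & v & w \end{vmatrix}$$

$$= \varphi\nabla \times \mathbf{f} + (\nabla\varphi) \times \mathbf{f} \tag{2.60}$$

To obtain the curl of $\varphi\mathbf{f}$, keep $\varphi$ fixed, and let $\nabla$ operate on $\mathbf{f}$, yielding $\varphi\nabla \times \mathbf{f}$; then keep $\mathbf{f}$ fixed, and allow $\nabla$ to operate on $\varphi$, yielding $\nabla\varphi$, and complete the vector product $(\nabla\varphi) \times \mathbf{f}$. The sum of these operations yields $\nabla \times (\varphi\mathbf{f})$.

Example 2.24. The curl of a gradient is zero.

$$\nabla \times (\nabla\varphi) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial\varphi}{\partial x} & \dfrac{\partial\varphi}{\partial y} & \dfrac{\partial\varphi}{\partial z} \end{vmatrix} = \mathbf{i}\left(\frac{\partial^2\varphi}{\partial y\,\partial z} - \frac{\partial^2\varphi}{\partial z\,\partial y}\right) + \mathbf{j}\left(\frac{\partial^2\varphi}{\partial z\,\partial x} - \frac{\partial^2\varphi}{\partial x\,\partial z}\right) + \mathbf{k}\left(\frac{\partial^2\varphi}{\partial x\,\partial y} - \frac{\partial^2\varphi}{\partial y\,\partial x}\right) = 0$$

provided $\varphi$ has continuous mixed second derivatives.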


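2.15. Further Properties of $\nabla$ Operator. We define the product $\mathbf{u} \cdot \nabla$ to be the scalar differential operator

$$u_x\frac{\partial}{\partial x} + u_y\frac{\partial}{\partial y} + u_z\frac{\partial}{\partial z} \tag{2.61}$$

We then have

$$(\mathbf{u} \cdot \nabla)\mathbf{v} = u_x\frac{\partial\mathbf{v}}{\partial x} + u_y\frac{\partial\mathbf{v}}{\partial y} + u_z\frac{\partial\mathbf{v}}{\partial z}$$

Let us now investigate $\mathbf{u} \times (\nabla \times \mathbf{v})$. Assuming that we can expand this term by the rule of the middle factor [remember that the expansion $\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = (\mathbf{a} \cdot \mathbf{c})\mathbf{b} - (\mathbf{a} \cdot \mathbf{b})\mathbf{c}$ holds true only for vectors; $\nabla$, strictly speaking, is not a vector], we have

$$\mathbf{u} \times (\nabla \times \mathbf{v}) = \nabla_v(\mathbf{u} \cdot \mathbf{v}) - (\mathbf{u} \cdot \nabla)\mathbf{v} \tag{2.62}$$

The subscript $v$ in the term $\nabla_v(\mathbf{u} \cdot \mathbf{v})$ means that $\nabla_v$ operates only on the components of $\mathbf{v}$. Thus

$$\nabla_v(\mathbf{u} \cdot \mathbf{v}) = \nabla_v(u_xv_x + u_yv_y + u_zv_z) = \left(u_x\frac{\partial v_x}{\partial x} + u_y\frac{\partial v_y}{\partial x} + u_z\frac{\partial v_z}{\partial x}\right)\mathbf{i} + \cdots$$

Interchanging the role of $\mathbf{u}$ and $\mathbf{v}$ yields

$$\mathbf{v} \times (\nabla \times \mathbf{u}) = \nabla_u(\mathbf{u} \cdot \mathbf{v}) - (\mathbf{v} \cdot \nabla)\mathbf{u} \tag{2.63}$$

Adding (2.62) and (2.63) results in

$$\nabla_u(\mathbf{u} \cdot \mathbf{v}) + \nabla_v(\mathbf{u} \cdot \mathbf{v}) = \mathbf{u} \times (\nabla \times \mathbf{v}) + \mathbf{v} \times (\nabla \times \mathbf{u}) + (\mathbf{u} \cdot \nabla)\mathbf{v} + (\mathbf{v} \cdot \nabla)\mathbf{u}$$

and

$$\nabla(\mathbf{u} \cdot \mathbf{v}) = \mathbf{u} \times (\nabla \times \mathbf{v}) + \mathbf{v} \times (\nabla \times \mathbf{u}) + (\mathbf{u} \cdot \nabla)\mathbf{v} + (\mathbf{v} \cdot \nabla)\mathbf{u} \tag{2.64}$$

The above analysis in no way constitutes a proof of (2.64). We leave it to the reader to verify (2.64) by direct expansion. The same remarks hold for the following examples:

$$\nabla \times (\mathbf{u} \times \mathbf{v}) = \nabla_u \times (\mathbf{u} \times \mathbf{v}) + \nabla_v \times (\mathbf{u} \times \mathbf{v}) = (\mathbf{v} \cdot \nabla)\mathbf{u} - \mathbf{v}(\nabla \cdot \mathbf{u}) + \mathbf{u}(\nabla \cdot \mathbf{v}) - (\mathbf{u} \cdot \nabla)\mathbf{v} \tag{2.65}$$

$$\nabla \cdot (\mathbf{u} \times \mathbf{v}) = \nabla_u \cdot (\mathbf{u} \times \mathbf{v}) + \nabla_v \cdot (\mathbf{u} \times \mathbf{v}) = (\nabla \times \mathbf{u}) \cdot \mathbf{v} - \nabla_v \cdot (\mathbf{v} \times \mathbf{u}) = (\nabla \times \mathbf{u}) \cdot \mathbf{v} - (\nabla \times \mathbf{v}) \cdot \mathbf{u} \tag{2.66}$$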


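We now list some important identities:

(1) $\nabla(uv) = u\nabla v + v\nabla u$

(2) $\nabla \cdot (\varphi\mathbf{u}) = \varphi\nabla \cdot \mathbf{u} + (\nabla\varphi) \cdot \mathbf{u}$

(3) $\nabla \times (\varphi\mathbf{u}) = \varphi\nabla \times \mathbf{u} + (\nabla\varphi) \times \mathbf{u}$

(4) $\nabla \times (\nabla\varphi) = 0$

(5) $\nabla \cdot (\nabla \times \mathbf{u}) = 0$

(6) $\nabla \cdot (\mathbf{u} \times \mathbf{v}) = (\nabla \times \mathbf{u}) \cdot \mathbf{v} - (\nabla \times \mathbf{v}) \cdot \mathbf{u}$

(7) $\nabla \times (\mathbf{u} \times \mathbf{v}) = (\mathbf{v} \cdot \nabla)\mathbf{u} - \mathbf{v}(\nabla \cdot \mathbf{u}) + \mathbf{u}(\nabla \cdot \mathbf{v}) - (\mathbf{u} \cdot \nabla)\mathbf{v}$

(8) $\nabla(\mathbf{u} \cdot \mathbf{v}) = \mathbf{u} \times (\nabla \times \mathbf{v}) + \mathbf{v} \times (\nabla \times \mathbf{u}) + (\mathbf{u} \cdot \nabla)\mathbf{v} + (\mathbf{v} \cdot \nabla)\mathbf{u}$

(9) $\nabla \times (\nabla \times \mathbf{u}) = \nabla(\nabla \cdot \mathbf{u}) - \nabla^2\mathbf{u}$

(10) $(\mathbf{u} \cdot \nabla)\mathbf{r} = \mathbf{u}$

(11) $\nabla \cdot \mathbf{r} = 3$

(12) $\nabla \times \mathbf{r} = 0$

(13) $d\varphi = d\mathbf{r} \cdot \nabla\varphi + \dfrac{\partial\varphi}{\partial t}\,dt$

(14) $d\mathbf{f} = (d\mathbf{r} \cdot \nabla)\mathbf{f} + \dfrac{\partial\mathbf{f}}{\partial t}\,dt$

(15) $\nabla \cdot (r^{-3}\mathbf{r}) = 0$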
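Since the operator argument above is heuristic, a numerical spot check of an identity is a useful safeguard before relying on it. The sketch below tests identities (6) and (8) at one point, assuming central differences; the two test fields `u` and `v`, the point `p`, and all helper names are arbitrary choices made here for illustration.

```python
import math

H = 1e-5  # finite-difference step (an assumed value)

def jacobian(field, p):
    """J[i][j] = d(field_i)/d(x_j) at p, by central differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        fwd, bwd = list(p), list(p)
        fwd[j] += H
        bwd[j] -= H
        a, b = field(*fwd), field(*bwd)
        for i in range(3):
            J[i][j] = (a[i] - b[i]) / (2 * H)
    return J

def curl_from(J):
    return (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def u(x, y, z):
    return (y*z, x*x, math.sin(x) + z)

def v(x, y, z):
    return (x*y, z*z, x + y*z)

def u_cross_v(x, y, z):
    return cross(u(x, y, z), v(x, y, z))

def u_dot_v(x, y, z):
    return dot(u(x, y, z), v(x, y, z))

p = (0.3, -0.8, 1.1)
up, vp = u(*p), v(*p)
Ju, Jv, Jw = jacobian(u, p), jacobian(v, p), jacobian(u_cross_v, p)

# Identity (6): div(u x v) = (curl u).v - (curl v).u
lhs6 = Jw[0][0] + Jw[1][1] + Jw[2][2]
rhs6 = dot(curl_from(Ju), vp) - dot(curl_from(Jv), up)

# Identity (8): grad(u.v) = u x curl v + v x curl u + (u.grad)v + (v.grad)u
lhs8 = []
for j in range(3):
    fwd, bwd = list(p), list(p)
    fwd[j] += H
    bwd[j] -= H
    lhs8.append((u_dot_v(*fwd) - u_dot_v(*bwd)) / (2 * H))
a = cross(up, curl_from(Jv))
b = cross(vp, curl_from(Ju))
rhs8 = [a[i] + b[i] + dot(Jv[i], up) + dot(Ju[i], vp) for i in range(3)]

print(lhs6, rhs6)
print(lhs8, rhs8)
```

Agreement at one point proves nothing, of course; the verification by direct expansion is still left to the reader.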
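Problems

1. Show that the divergence of a curl is zero.

2. Find the divergence and curl of $(x\mathbf{i} - y\mathbf{j})/(x + y)$, of $x\cos z\,\mathbf{i} + y\ln x\,\mathbf{j} - z\mathbf{k}$.

3. If $\mathbf{A} = ax\,\mathbf{i} + by\,\mathbf{j} + cz\,\mathbf{k}$, show that $\nabla(\mathbf{A} \cdot \mathbf{r}) = 2\mathbf{A}$.

4. Show that $\nabla^2(1/r) = 0$, $r = (x^2 + y^2 + z^2)^{1/2}$.

5. Show that $\nabla \times [f(r)\mathbf{r}] = 0$.

6. If $\mathbf{f} = u\nabla v$ show that $\mathbf{f} \cdot \nabla \times \mathbf{f} = 0$, $u$ not constant.

7. Show that $(\mathbf{v} \cdot \nabla)\mathbf{v} = \tfrac{1}{2}\nabla v^2 - \mathbf{v} \times (\nabla \times \mathbf{v})$.

8. If $\mathbf{A}$ is a constant unit vector show that $\mathbf{A} \cdot [\nabla(\mathbf{A} \cdot \mathbf{v}) - \nabla \times (\mathbf{v} \times \mathbf{A})] = \nabla \cdot \mathbf{v}$.

9. If $\mathbf{w}$ is a constant vector, $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, show that $\nabla \times (\mathbf{w} \times \mathbf{r}) = 2\mathbf{w}$.

10. Show that $\nabla \cdot (\mathbf{u} \times \mathbf{v}) = (\nabla \times \mathbf{u}) \cdot \mathbf{v} - (\nabla \times \mathbf{v}) \cdot \mathbf{u}$.

11. Show that $d\mathbf{f} = (d\mathbf{r} \cdot \nabla)\mathbf{f} + \dfrac{\partial\mathbf{f}}{\partial t}\,dt$ if $\mathbf{f} = \mathbf{f}(x, y, z, t)$.

12. Show that $\nabla \times (\nabla \times \mathbf{u}) = \nabla(\nabla \cdot \mathbf{u}) - \nabla^2\mathbf{u}$.

13. Let $u = u(x, y, z)$, $v = v(x, y, z)$, and assume $\nabla u \times \nabla v = 0$. Let $d\mathbf{r}$ be the vector from $P$ to $Q$; $P$ and $Q$ are points on the surface $u(x, y, z) = \text{constant}$. From $dv = d\mathbf{r} \cdot \nabla v$, show that $dv = 0$ and hence that $v$ remains constant when $u$ is constant. This implies that $v = f(u)$ or $F(u, v) = 0$. Show conversely that if $u$ and $v$ satisfy a relationship $F(u, v) = 0$ then $\nabla u \times \nabla v = 0$. Hint: $\nabla F(u, v) = 0 = \dfrac{\partial F}{\partial u}\nabla u + \dfrac{\partial F}{\partial v}\nabla v$. We assume that $\dfrac{\partial F}{\partial u}$ and $\dfrac{\partial F}{\partial v}$ do not both vanish identically.

14. Prove that a necessary and sufficient condition that $u$, $v$, $w$ satisfy an equation $F(u, v, w) = 0$ is that $\nabla u \cdot \nabla v \times \nabla w = 0$, or

$$J\left(\frac{u, v, w}{x, y, z}\right) = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} & \dfrac{\partial u}{\partial z} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} & \dfrac{\partial v}{\partial z} \\ \dfrac{\partial w}{\partial x} & \dfrac{\partial w}{\partial y} & \dfrac{\partial w}{\partial z} \end{vmatrix} = 0$$

This determinant is called the Jacobian of $(u, v, w)$ with respect to $(x, y, z)$.

15. Let $\mathbf{A} = \nabla \times (\varphi\mathbf{i})$, $\nabla^2\varphi = 0$, $\varphi = X(x)Y(y)Z(z)$. Show that $\mathbf{A} \cdot \nabla \times \mathbf{A} = 0$.

16. Show that $(\nabla \times \mathbf{f}) \cdot (\mathbf{u} \times \mathbf{v}) = [(\mathbf{u} \cdot \nabla)\mathbf{f}] \cdot \mathbf{v} - [(\mathbf{v} \cdot \nabla)\mathbf{f}] \cdot \mathbf{u}$.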


2.16. Orthogonal Curvilinear Coordinates. Up to the present moment we have expressed the formulas for the gradient, divergence, curl, and Laplacian in the familiar rectangular cartesian coordinate system. It is often quite necessary to express the above quantities in other coordinate systems. For example, if one were to solve $\nabla^2V = 0$ subject to the boundary condition that $V = \text{constant}$ on the sphere $x^2 + y^2 + z^2 = a^2$, one would find it of great aid to express $\nabla^2V$ in spherical coordinates. The boundary conditions of a physical problem dictate to a great extent the coordinate system to be used.

Let us now consider a spherical coordinate system (see Fig. 2.23). The relationships between $x$, $y$, $z$ and $r$, $\theta$, $\varphi$ are

$$\begin{aligned}
r &= (x^2 + y^2 + z^2)^{1/2} \\
\theta &= \cos^{-1}\frac{z}{(x^2 + y^2 + z^2)^{1/2}} \\
\varphi &= \tan^{-1}\frac{y}{x}
\end{aligned} \tag{2.67}$$

and

$$\begin{aligned}
x &= r\sin\theta\cos\varphi \\
y &= r\sin\theta\sin\varphi \\
z &= r\cos\theta
\end{aligned} \tag{2.68}$$


Let us note the following pertinent facts: Through any point $P(x, y, z)$, other than the origin, there pass the sphere $r = \text{constant}$, the cone $\theta = \text{constant}$, and the plane $\varphi = \text{constant}$. These surfaces intersect in pairs which yield three curves through $P$. The intersection of the sphere and the cone is a circle. Along this curve only $\varphi$ can change, since $r$ and $\theta$ are constant. This curve is called appropriately the $\varphi$ curve. At $P$ we construct a unit vector, $\mathbf{e}_\varphi$, tangent to the $\varphi$ curve. The direction of $\mathbf{e}_\varphi$ is chosen in the direction of positive increase in $\varphi$. Similarly one obtains $\mathbf{e}_r$ and $\mathbf{e}_\theta$. The reader can easily verify that these unit tangents form an orthogonal trihedral at $P$ such that $\mathbf{e}_r \times \mathbf{e}_\theta = \mathbf{e}_\varphi$.

Fig. 2.23

The vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_\varphi$ form a basis for spherical coordinates in exactly the same manner that $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ form a basis for rectangular coordinates. Any vector at $P$ may be written $\mathbf{f} = f_r\mathbf{e}_r + f_\theta\mathbf{e}_\theta + f_\varphi\mathbf{e}_\varphi$, where $f_r$, $f_\theta$, $f_\varphi$ are the projections of $\mathbf{f}$ on the vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_\varphi$, respectively. Unlike $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ the vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_\varphi$ change directions as we move from point to point. Thus we may expect to find more complicated formulas arising when the gradient, divergence, etc., are computed in spherical coordinates.

Since $\nabla r$ is perpendicular to the surface $r = \text{constant}$, $\nabla r$ is parallel to $\mathbf{e}_r$. Similarly $\nabla\theta$ is parallel to $\mathbf{e}_\theta$; $\nabla\varphi$ is parallel to $\mathbf{e}_\varphi$. Thus

$$\begin{aligned}
\mathbf{e}_r &= h_r\nabla r \\
\mathbf{e}_\theta &= h_\theta\nabla\theta \\
\mathbf{e}_\varphi &= h_\varphi\nabla\varphi
\end{aligned} \tag{2.69}$$


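Let $d\mathbf{r}$ be a vector of length $ds$ parallel to $\mathbf{e}_r$. Then $dr = d\mathbf{r} \cdot \nabla r$ [see (2.50)], so that $h_r\,dr = d\mathbf{r} \cdot h_r\nabla r = d\mathbf{r} \cdot \mathbf{e}_r = ds$. Since $dr = ds$, we have $h_r = 1$. Now let $d\mathbf{r}$ be a vector of length $ds$ parallel to $\mathbf{e}_\theta$. We have $d\theta = d\mathbf{r} \cdot \nabla\theta$, $h_\theta\,d\theta = d\mathbf{r} \cdot h_\theta\nabla\theta = d\mathbf{r} \cdot \mathbf{e}_\theta = ds$. It is seen that $h_\theta$ is that factor which must be multiplied into $d\theta$ to yield arc length. Thus $h_\theta = r$, and similarly $h_\varphi = r\sin\theta$. In spherical coordinates

$$ds^2 = dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2$$

The positive square roots of the coefficients of $dr^2$, $d\theta^2$, $d\varphi^2$ yield $h_r$, $h_\theta$, $h_\varphi$, respectively.

We can now write

$$\begin{aligned}
\mathbf{e}_r &= \nabla r = \mathbf{e}_\theta \times \mathbf{e}_\varphi = r^2\sin\theta\,\nabla\theta \times \nabla\varphi \\
\mathbf{e}_\theta &= r\nabla\theta = \mathbf{e}_\varphi \times \mathbf{e}_r = r\sin\theta\,\nabla\varphi \times \nabla r \\
\mathbf{e}_\varphi &= r\sin\theta\,\nabla\varphi = \mathbf{e}_r \times \mathbf{e}_\theta = r\,\nabla r \times \nabla\theta
\end{aligned} \tag{2.70}$$

The differential of volume is

$$dV = ds_1\,ds_2\,ds_3 = h_rh_\theta h_\varphi\,dr\,d\theta\,d\varphi = r^2\sin\theta\,dr\,d\theta\,d\varphi$$

It is easy to show that $\nabla r \cdot \nabla\theta \times \nabla\varphi = (h_rh_\theta h_\varphi)^{-1} = (r^2\sin\theta)^{-1}$.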
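The scale factors can also be estimated directly from (2.68). Since $h_\theta$ is the factor that converts $d\theta$ into arc length, it equals the length of $\partial\mathbf{r}/\partial\theta$, and likewise for $r$ and $\varphi$. The sketch below assumes that reading and approximates each derivative by a central difference; the function names and the sample point are illustrative only.

```python
import math

def position(r, theta, phi):
    """Cartesian coordinates from spherical ones, following (2.68)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def scale_factors(r, theta, phi, h=1e-6):
    """Arc length per unit change of each coordinate: (h_r, h_theta, h_phi)."""
    q = (r, theta, phi)
    factors = []
    for k in range(3):
        fwd, bwd = list(q), list(q)
        fwd[k] += h
        bwd[k] -= h
        a, b = position(*fwd), position(*bwd)
        chord = math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
        factors.append(chord / (2 * h))
    return factors

r, theta, phi = 2.0, 0.8, 1.9
h_r, h_theta, h_phi = scale_factors(r, theta, phi)
print(h_r, h_theta, h_phi)  # close to 1, r, r*sin(theta)
```

Replacing `position` by the cylindrical map $x = r\cos\varphi$, $y = r\sin\varphi$, $z = z$ gives the scale factors needed for the cylindrical problems at the end of this section.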
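Now we consider the gradient of $f(r, \theta, \varphi)$. We have

$$\begin{aligned}
\nabla f &= \frac{\partial f}{\partial r}\nabla r + \frac{\partial f}{\partial\theta}\nabla\theta + \frac{\partial f}{\partial\varphi}\nabla\varphi \\
&= \frac{1}{h_r}\frac{\partial f}{\partial r}\mathbf{e}_r + \frac{1}{h_\theta}\frac{\partial f}{\partial\theta}\mathbf{e}_\theta + \frac{1}{h_\varphi}\frac{\partial f}{\partial\varphi}\mathbf{e}_\varphi \\
&= \frac{\partial f}{\partial r}\mathbf{e}_r + \frac{1}{r}\frac{\partial f}{\partial\theta}\mathbf{e}_\theta + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\varphi}\mathbf{e}_\varphi
\end{aligned} \tag{2.71}$$

Equation (2.71) is the gradient of a scalar in spherical coordinates.

To compute the divergence of $\mathbf{f} = f_r\mathbf{e}_r + f_\theta\mathbf{e}_\theta + f_\varphi\mathbf{e}_\varphi$, we write

$$\mathbf{f} = f_r r^2\sin\theta\,\nabla\theta \times \nabla\varphi + f_\theta r\sin\theta\,\nabla\varphi \times \nabla r + f_\varphi r\,\nabla r \times \nabla\theta$$

from (2.70). Then

$$\nabla \cdot \mathbf{f} = \nabla(f_r r^2\sin\theta) \cdot \nabla\theta \times \nabla\varphi + \nabla(f_\theta r\sin\theta) \cdot \nabla\varphi \times \nabla r + \nabla(f_\varphi r) \cdot \nabla r \times \nabla\theta$$

since $\nabla \cdot (\nabla\theta \times \nabla\varphi) = 0$, etc. [see formula for $\nabla \cdot (\mathbf{u} \times \mathbf{v})$]. But

$$\begin{aligned}
\nabla(f_r r^2\sin\theta) \cdot \nabla\theta \times \nabla\varphi &= \frac{\partial}{\partial r}(f_r r^2\sin\theta)\,\nabla r \cdot \nabla\theta \times \nabla\varphi + \frac{\partial}{\partial\theta}(f_r r^2\sin\theta)\,\nabla\theta \cdot \nabla\theta \times \nabla\varphi \\
&\quad + \frac{\partial}{\partial\varphi}(f_r r^2\sin\theta)\,\nabla\varphi \cdot \nabla\theta \times \nabla\varphi \\
&= \frac{1}{h_rh_\theta h_\varphi}\frac{\partial}{\partial r}(f_r r^2\sin\theta) = \frac{1}{r^2\sin\theta}\frac{\partial}{\partial r}(f_r r^2\sin\theta)
\end{aligned}$$

We thus obtain

$$\nabla \cdot \mathbf{f} = \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(f_r r^2\sin\theta) + \frac{\partial}{\partial\theta}(f_\theta r\sin\theta) + \frac{\partial}{\partial\varphi}(f_\varphi r)\right] \tag{2.72}$$

Equation (2.72) is the divergence of $\mathbf{f}$ in spherical coordinates.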
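If we apply (2.71) and (2.72) to $\mathbf{f} = \nabla V$, we obtain

$$\nabla^2V = \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}\left(r^2\sin\theta\,\frac{\partial V}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial V}{\partial\theta}\right) + \frac{\partial}{\partial\varphi}\left(\frac{1}{\sin\theta}\frac{\partial V}{\partial\varphi}\right)\right] \tag{2.73}$$

Equation (2.73) is the Laplacian of $V$ in spherical coordinates.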
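Formula (2.73) can be compared with the cartesian Laplacian (2.56) for a concrete function. The sketch below assumes nested central differences for the derivatives in (2.73); the test function $V = x^2y + z^3$, whose cartesian Laplacian $2y + 6z$ is found by hand, and all names are illustrative choices.

```python
import math

H = 1e-4  # finite-difference step (an assumed value)

def V(x, y, z):
    return x*x*y + z**3  # arbitrary smooth test function

def laplacian_cartesian(x, y, z):
    return 2*y + 6*z  # Laplacian of the V above, computed by hand

def W(r, th, ph):
    """V expressed in spherical coordinates through (2.68)."""
    return V(r * math.sin(th) * math.cos(ph),
             r * math.sin(th) * math.sin(ph),
             r * math.cos(th))

def d(g, s):
    """Central-difference derivative of a one-variable function g at s."""
    return (g(s + H) - g(s - H)) / (2 * H)

def laplacian_spherical(r, th, ph):
    """Right-hand side of (2.73), term by term."""
    term_r = d(lambda a: a * a * math.sin(th) * d(lambda b: W(b, th, ph), a), r)
    term_th = d(lambda a: math.sin(a) * d(lambda b: W(r, b, ph), a), th)
    term_ph = d(lambda a: d(lambda b: W(r, th, b), a) / math.sin(th), ph)
    return (term_r + term_th + term_ph) / (r * r * math.sin(th))

r, th, ph = 1.3, 0.9, 0.5
x = r * math.sin(th) * math.cos(ph)
y = r * math.sin(th) * math.sin(ph)
z = r * math.cos(th)
lap_sph = laplacian_spherical(r, th, ph)
lap_cart = laplacian_cartesian(x, y, z)
print(lap_sph, lap_cart)
```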
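To obtain the curl of $\mathbf{f}$, let $\mathbf{f} = f_rh_r\nabla r + f_\theta h_\theta\nabla\theta + f_\varphi h_\varphi\nabla\varphi$. It can be easily shown that

$$\nabla \times \mathbf{f} = \frac{1}{h_rh_\theta h_\varphi}\begin{vmatrix} h_r\mathbf{e}_r & h_\theta\mathbf{e}_\theta & h_\varphi\mathbf{e}_\varphi \\ \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\varphi} \\ h_rf_r & h_\theta f_\theta & h_\varphi f_\varphi \end{vmatrix} \tag{2.74}$$

For the general orthogonal curvilinear coordinate system $(u_1, u_2, u_3)$ where $ds^2 = h_1^2\,du_1^2 + h_2^2\,du_2^2 + h_3^2\,du_3^2$ we list without proof

$$\begin{aligned}
\nabla f &= \frac{1}{h_1}\frac{\partial f}{\partial u_1}\mathbf{e}_1 + \frac{1}{h_2}\frac{\partial f}{\partial u_2}\mathbf{e}_2 + \frac{1}{h_3}\frac{\partial f}{\partial u_3}\mathbf{e}_3 \\
\nabla \cdot \mathbf{f} &= \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u_1}(h_2h_3f_1) + \frac{\partial}{\partial u_2}(h_3h_1f_2) + \frac{\partial}{\partial u_3}(h_1h_2f_3)\right] \\
\nabla \times \mathbf{f} &= \frac{1}{h_1h_2h_3}\begin{vmatrix} h_1\mathbf{e}_1 & h_2\mathbf{e}_2 & h_3\mathbf{e}_3 \\ \dfrac{\partial}{\partial u_1} & \dfrac{\partial}{\partial u_2} & \dfrac{\partial}{\partial u_3} \\ h_1f_1 & h_2f_2 & h_3f_3 \end{vmatrix} \\
\nabla^2f &= \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u_1}\left(\frac{h_2h_3}{h_1}\frac{\partial f}{\partial u_1}\right) + \frac{\partial}{\partial u_2}\left(\frac{h_3h_1}{h_2}\frac{\partial f}{\partial u_2}\right) + \frac{\partial}{\partial u_3}\left(\frac{h_1h_2}{h_3}\frac{\partial f}{\partial u_3}\right)\right]
\end{aligned} \tag{2.75}$$

It would be a good exercise for the student to derive (2.75).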
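Problems

1. For $x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, $z = r\cos\theta$ show that the form

$$ds^2 = dx^2 + dy^2 + dz^2$$

becomes $ds^2 = dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2$.

2. For $x = r\cos\varphi$, $y = r\sin\varphi$, $z = z$ show that $ds^2 = dr^2 + r^2\,d\varphi^2 + dz^2$.

3. Express $\nabla f$, $\nabla \cdot \mathbf{f}$, $\nabla \times \mathbf{f}$ in cylindrical coordinates and show that

$$\nabla^2f = \frac{1}{r}\left[\frac{\partial}{\partial r}\left(r\frac{\partial f}{\partial r}\right) + \frac{\partial}{\partial\varphi}\left(\frac{1}{r}\frac{\partial f}{\partial\varphi}\right) + \frac{\partial}{\partial z}\left(r\frac{\partial f}{\partial z}\right)\right]$$

4. Solve $\nabla^2V = 0$ in spherical coordinates if $V = V(r)$.

5. Solve $\nabla^2V = 0$ in cylindrical coordinates if $V = V(r)$.

6. Show that $\nabla \times [f(r)\mathbf{r}] = 0$, $r$ a spherical coordinate.

7. By making use of $\nabla^2\mathbf{V} = \nabla(\nabla \cdot \mathbf{V}) - \nabla \times (\nabla \times \mathbf{V})$, find $\nabla^2\mathbf{V}$ for $\mathbf{V} = v(r)\mathbf{e}_r$, $\mathbf{V}$ purely radial in spherical coordinates.

8. Express $\nabla^2\mathbf{V}$, for $\mathbf{V} = f(r)\mathbf{e}_r + \varphi(z)\mathbf{e}_z$, in cylindrical coordinates.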


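9. Consider the equation

$$(\lambda + \mu)\nabla(\nabla \cdot \mathbf{s}) + \mu\mathbf{s} = \rho\frac{\partial^2\mathbf{s}}{\partial t^2}$$

$\lambda$, $\mu$, $\rho$ constants. Assume $\mathbf{s} = e^{ipt}\mathbf{s}_1$, $p$ constant, $i = \sqrt{-1}$, and show that

$$(\lambda + \mu)\nabla(\nabla \cdot \mathbf{s}_1) + (\mu + \rho p^2)\mathbf{s}_1 = 0$$

Next show that $[\nabla^2 + (\mu + \rho p^2)/(\lambda + \mu)](\nabla \cdot \mathbf{s}_1) = 0$, $\lambda + \mu \neq 0$.

10. If $\mathbf{A} = \nabla \times (\psi\mathbf{r})$, $\nabla^2\psi = 0$, $\psi = R(r)\Theta(\theta)\Phi(\varphi)$, show that $\mathbf{A} \cdot \nabla \times \mathbf{A} = 0$. Is it necessary that $\psi = R(r)\Theta(\theta)\Phi(\varphi)$?

11. If $ds^2 = \sum_{i=1}^{3} dx^i\,dx^i$, $x^i = x^i(y^1, y^2, y^3)$, $i = 1, 2, 3$, show that

$$ds^2 = g_{\alpha\beta}\,dy^\alpha\,dy^\beta$$

where

$$g_{\alpha\beta} = \sum_{i=1}^{3}\frac{\partial x^i}{\partial y^\alpha}\frac{\partial x^i}{\partial y^\beta}$$

12. Derive (2.74).

13. Derive (2.75).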


2.17. The Line Integral. We start with a vector field

$$\mathbf{f} = X(x, y, z)\mathbf{i} + Y(x, y, z)\mathbf{j} + Z(x, y, z)\mathbf{k}$$

Let $\mathbf{r}(t) = x(t)\mathbf{i} + y(t)\mathbf{j} + z(t)\mathbf{k}$, $a \le t \le b$, be a space curve $\Gamma$ joining the points $A$ and $B$ with position vectors $\mathbf{r}(a)$, $\mathbf{r}(b)$. One may subdivide $\Gamma$ into $n$ parts by subdividing $t$ into

$$a = t_0 < t_1 < t_2 < \cdots < t_{j-1} < t_j < \cdots < t_n = b$$

(see Fig. 2.24). Let $\Delta\mathbf{r}_j = \mathbf{r}(t_j) - \mathbf{r}(t_{j-1})$, and let $\xi_j$ be any number such that $t_{j-1} \le \xi_j \le t_j$. We can compute $x(\xi_j)$, $y(\xi_j)$, $z(\xi_j)$, which yields the vector $\mathbf{f}(\xi_j)$. One then forms the sum

$$S_n = \sum_{j=1}^{n} \mathbf{f}(\xi_j) \cdot \Delta\mathbf{r}_j \tag{2.76}$$

Fig. 2.24

If $\lim_{n \to \infty} S_n$ exists independent of how the $\xi_j$ are chosen provided maximum $|\Delta\mathbf{r}_j| \to 0$ as $n \to \infty$, we define this limit to be the line integral of $\mathbf{f}$ along the curve $\Gamma$ from $A$ to $B$. As in the calculus the limit is written

$$\int_A^B \mathbf{f} \cdot d\mathbf{r} = \int_\Gamma \mathbf{f} \cdot d\mathbf{r} \tag{2.77}$$

If $\mathbf{f}$ is continuous along $\Gamma$ and if $\Gamma$ has continuous turning tangents, that is, $\dfrac{d\mathbf{r}}{dt}$ is continuous, then (2.77) exists from Riemann integration theory and can be written

$$\int_A^B \mathbf{f} \cdot d\mathbf{r} = \int_a^b \left[X(x(t), y(t), z(t))\frac{dx}{dt} + Y(x(t), y(t), z(t))\frac{dy}{dt} + Z(x(t), y(t), z(t))\frac{dz}{dt}\right] dt \tag{2.77'}$$

We use (2.77') as a means of evaluating the line integral. There will be some vector fields for which the line integral from $A$ to $B$ will be independent of the curve $\Gamma$ joining $A$ and $B$. Such vector fields are said to be conservative. If $\mathbf{f}$ is a force field, (2.77) defines the work done by the force field as one moves a unit particle (mass or charge) from $A$ to $B$. We now work out a few examples and then take up the case of conservative vector fields.


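Example 2.25. Let $\mathbf{f} = xy\,\mathbf{i} + z\,\mathbf{j} - xyz\,\mathbf{k}$, and let the path of integration be the curve $x = t$, $y = t^2$, $z = t$, the integration performed from the origin $O$ to the point $P(1, 1, 1)$, the range of $t$ being given by $0 \le t \le 1$. Along the curve, $\mathbf{f} = t^3\mathbf{i} + t\mathbf{j} - t^4\mathbf{k}$, and $d\mathbf{r} = \dfrac{d\mathbf{r}}{dt}\,dt = (\mathbf{i} + 2t\mathbf{j} + \mathbf{k})\,dt$ so that $\mathbf{f} \cdot d\mathbf{r} = (t^3 + 2t^2 - t^4)\,dt$. Hence

$$\int_O^P \mathbf{f} \cdot d\mathbf{r} = \int_0^1 (t^3 + 2t^2 - t^4)\,dt = \frac{43}{60}$$

If we choose the straight-line path $x = t$, $y = t$, $z = t$ from the origin to the point $P(1, 1, 1)$, we obtain $\displaystyle\int_O^P \mathbf{f} \cdot d\mathbf{r} = \int_0^1 (t^2 + t - t^3)\,dt = \frac{7}{12}$. It is seen that the vector field $\mathbf{f}$ is not conservative.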
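The two values in Example 2.25 can be reproduced directly from the defining sum (2.76). The sketch below assumes a uniform subdivision of $t$ with $\xi_j$ taken at the midpoint of each subinterval; the function names and the choice $n = 2000$ are illustrative.

```python
def line_integral(field, curve, a, b, n=2000):
    """Approximate the sum (2.76): sum of f(xi_j) . (r(t_j) - r(t_{j-1}))."""
    total = 0.0
    for j in range(1, n + 1):
        t0 = a + (b - a) * (j - 1) / n
        t1 = a + (b - a) * j / n
        r0, r1 = curve(t0), curve(t1)
        f_mid = field(*curve(0.5 * (t0 + t1)))  # xi_j is the midpoint
        total += sum(fc * (q1 - q0) for fc, q0, q1 in zip(f_mid, r0, r1))
    return total

def f(x, y, z):
    # The field of Example 2.25.
    return (x * y, z, -x * y * z)

def curved_path(t):
    return (t, t * t, t)

def straight_path(t):
    return (t, t, t)

along_curve = line_integral(f, curved_path, 0.0, 1.0)
along_line = line_integral(f, straight_path, 0.0, 1.0)
print(along_curve, along_line)  # close to 43/60 and 7/12
```

The two results differ although both paths join the same end points, which is the numerical face of the statement that $\mathbf{f}$ is not conservative.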
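Example 2.26. Let $\mathbf{f} = x^2\mathbf{i} + y^3\mathbf{j}$, and let the path of integration be the parabola $y = x^2$ from $(0, 0)$ to $(1, 1)$. Let $x = t$ so that $y = t^2$, $0 \le t \le 1$, and $\mathbf{r}(t) = x\mathbf{i} + y\mathbf{j} = t\mathbf{i} + t^2\mathbf{j}$.

$$\int_{(0,0)}^{(1,1)} \mathbf{f} \cdot d\mathbf{r} = \int_0^1 (t^2\mathbf{i} + t^6\mathbf{j}) \cdot (\mathbf{i} + 2t\mathbf{j})\,dt = \int_0^1 (t^2 + 2t^7)\,dt = \frac{7}{12}$$

For the same $\mathbf{f}$ let us compute the line integral by moving along the $x$ axis from $x = 0$ to $x = 1$ and then moving along the line $x = 1$ from $y = 0$ to $y = 1$. Although the continuous curve does not have a continuous tangent at the point $(1, 0)$, we need not be concerned since one point of discontinuity does not affect the Riemann integral provided $\dfrac{d\mathbf{r}}{dt}$ is bounded in the neighborhood of the discontinuity. We have

$$\int_{(0,0)}^{(1,1)} \mathbf{f} \cdot d\mathbf{r} = \int_{(0,0)}^{(1,0)} \mathbf{f} \cdot d\mathbf{r} + \int_{(1,0)}^{(1,1)} \mathbf{f} \cdot d\mathbf{r}$$

Along the first part of the curve $x = t$, $y = 0$, $dx = dt$, $dy = 0$. Along the second part of the curve $x = 1$, $dx = 0$, $y = t$, $dy = dt$, and

$$\int_{(0,0)}^{(1,1)} \mathbf{f} \cdot d\mathbf{r} = \int_0^1 t^2\,dt + \int_0^1 t^3\,dt = \frac{7}{12}$$

We become suspicious and guess that $\mathbf{f}$ is conservative. Notice that

$$\mathbf{f} = \nabla\left(\frac{x^3}{3} + \frac{y^4}{4} + \text{constant}\right) = \nabla\varphi \qquad \varphi = \frac{x^3}{3} + \frac{y^4}{4} + \text{constant}$$

Hence $\mathbf{f} \cdot d\mathbf{r} = \nabla\varphi \cdot d\mathbf{r} = d\varphi$ so that

$$\int_A^B \mathbf{f} \cdot d\mathbf{r} = \int_A^B d\varphi = \varphi\Big|_A^B = \varphi(B) - \varphi(A)$$

Since $\varphi$ is single-valued, the value of $\displaystyle\int_A^B \mathbf{f} \cdot d\mathbf{r}$ depends only upon the upper and lower limits and is independent of the path of integration from $A$ to $B$.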

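Example 2.27. We have just seen from Example 2.26 that if $\mathbf{f} = \nabla\varphi$, $\varphi$ single-valued, then $\mathbf{f}$ is a conservative vector field. Conversely, let us assume that $\int \mathbf{f} \cdot d\mathbf{r}$ is independent of the path. We show that $\mathbf{f}$ is the gradient of a scalar. Define

$$\varphi(x, y, z) = \int_{P_0(x_0, y_0, z_0)}^{P(x, y, z)} \mathbf{f} \cdot d\mathbf{r}$$

The value of $\varphi$ depends only on the upper limit (we keep $P_0$ fixed). Then

$$\varphi(x + \Delta x, y, z) = \int_{P_0}^{Q(x + \Delta x, y, z)} \mathbf{f} \cdot d\mathbf{r}$$

and

$$\varphi(x + \Delta x, y, z) - \varphi(x, y, z) = \int_{P(x, y, z)}^{Q(x + \Delta x, y, z)} X(x, y, z)\,dx + Y(x, y, z)\,dy + Z(x, y, z)\,dz$$

We choose the straight-line path from $P$ to $Q$ as our curve of integration. Then $dy = dz = 0$, and

$$\varphi(x + \Delta x, y, z) - \varphi(x, y, z) = \int_x^{x + \Delta x} X(x, y, z)\,dx$$

Applying the theorem of the mean for integrals yields

$$\varphi(x + \Delta x, y, z) - \varphi(x, y, z) = X(\xi, y, z)\,\Delta x \qquad x \le \xi \le x + \Delta x$$

so that

$$\frac{\partial\varphi}{\partial x} = \lim_{\Delta x \to 0} \frac{\varphi(x + \Delta x, y, z) - \varphi(x, y, z)}{\Delta x} = \lim_{\Delta x \to 0} X(\xi, y, z) = X(x, y, z)$$

assuming $X$ continuous. Similarly $Y = \dfrac{\partial\varphi}{\partial y}$, $Z = \dfrac{\partial\varphi}{\partial z}$, so that

$$\mathbf{f} = X\mathbf{i} + Y\mathbf{j} + Z\mathbf{k} = \frac{\partial\varphi}{\partial x}\mathbf{i} + \frac{\partial\varphi}{\partial y}\mathbf{j} + \frac{\partial\varphi}{\partial z}\mathbf{k} = \nabla\varphi \qquad \text{Q.E.D.}$$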


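A quick test to determine whether $\mathbf{f}$ is conservative is the following: Note that, if $\mathbf{f} = \nabla\varphi$, then $\nabla \times \mathbf{f} = 0$. Conversely, assume $\nabla \times \mathbf{f} = 0$. Then for $\mathbf{f} = X\mathbf{i} + Y\mathbf{j} + Z\mathbf{k}$ we have

$$\frac{\partial X}{\partial y} = \frac{\partial Y}{\partial x} \qquad \frac{\partial Y}{\partial z} = \frac{\partial Z}{\partial y} \qquad \frac{\partial Z}{\partial x} = \frac{\partial X}{\partial z} \tag{2.78}$$

Let

$$\varphi(x, y, z) = \int_{x_0}^{x} X(x, y, z)\,dx + \int_{y_0}^{y} Y(x_0, y, z)\,dy + \int_{z_0}^{z} Z(x_0, y_0, z)\,dz \tag{2.79}$$

We now show that $\mathbf{f} = \nabla\varphi$. From the calculus

$$\frac{\partial\varphi}{\partial x} = X(x, y, z)$$

$$\begin{aligned}
\frac{\partial\varphi}{\partial y} &= \int_{x_0}^{x} \frac{\partial X}{\partial y}\,dx + Y(x_0, y, z) = \int_{x_0}^{x} \frac{\partial Y}{\partial x}\,dx + Y(x_0, y, z) \\
&= Y(x, y, z) - Y(x_0, y, z) + Y(x_0, y, z) = Y(x, y, z)
\end{aligned}$$

$$\begin{aligned}
\frac{\partial\varphi}{\partial z} &= \int_{x_0}^{x} \frac{\partial X}{\partial z}\,dx + \int_{y_0}^{y} \frac{\partial Y(x_0, y, z)}{\partial z}\,dy + Z(x_0, y_0, z) \\
&= \int_{x_0}^{x} \frac{\partial Z}{\partial x}\,dx + \int_{y_0}^{y} \frac{\partial Z(x_0, y, z)}{\partial y}\,dy + Z(x_0, y_0, z) \\
&= Z(x, y, z) - Z(x_0, y, z) + Z(x_0, y, z) - Z(x_0, y_0, z) + Z(x_0, y_0, z) = Z(x, y, z)
\end{aligned}$$

Hence

$$\mathbf{f} = X\mathbf{i} + Y\mathbf{j} + Z\mathbf{k} = \frac{\partial\varphi}{\partial x}\mathbf{i} + \frac{\partial\varphi}{\partial y}\mathbf{j} + \frac{\partial\varphi}{\partial z}\mathbf{k} = \nabla\varphi$$

The constants $x_0$, $y_0$, $z_0$ can be chosen arbitrarily.

We have proved that a necessary and sufficient condition that $\mathbf{f}$ be the gradient of a scalar is that $\nabla \times \mathbf{f} = 0$. Thus for $\mathbf{f}$ conservative we have $\mathbf{f} = \nabla\varphi$ or $\nabla \times \mathbf{f} = 0$. We also say that such an $\mathbf{f}$ is an irrotational vector field.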
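Example 2.28. It is easy to show that $\mathbf{f} = 2xye^z\,\mathbf{i} + x^2e^z\,\mathbf{j} + x^2ye^z\,\mathbf{k}$ is irrotational. Then

$$\varphi(x, y, z) = \int_0^x 2xye^z\,dx + \int_0^y 0^2 \cdot e^z\,dy + \int_0^z 0^2 \cdot 0 \cdot e^z\,dz = x^2ye^z$$

and $\mathbf{f} = \nabla(x^2ye^z + \text{constant})$.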
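Formula (2.79) is constructive, so the potential of Example 2.28 can be computed by numerical quadrature and compared with $x^2ye^z$. The sketch below assumes composite Simpson's rule for each of the three integrals and takes $x_0 = y_0 = z_0 = 0$ as in the example; `simpson` and `potential` are names introduced here for illustration.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

# Components of the field of Example 2.28.
def X(x, y, z):
    return 2 * x * y * math.exp(z)

def Y(x, y, z):
    return x * x * math.exp(z)

def Z(x, y, z):
    return x * x * y * math.exp(z)

def potential(x, y, z, x0=0.0, y0=0.0, z0=0.0):
    """The three integrals of (2.79), in the same order as in the text."""
    return (simpson(lambda s: X(s, y, z), x0, x)
            + simpson(lambda s: Y(x0, s, z), y0, y)
            + simpson(lambda s: Z(x0, y0, s), z0, z))

point = (1.2, 0.7, 0.3)
phi_numeric = potential(*point)
phi_exact = point[0] ** 2 * point[1] * math.exp(point[2])
print(phi_numeric, phi_exact)
```

A different base point $(x_0, y_0, z_0)$ changes the result only by an additive constant, in agreement with the remark that the constants can be chosen arbitrarily.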




Problems 


1. Given $\mathbf{f} = e^x\mathbf{i} - xy\,\mathbf{j}$, evaluate $\int \mathbf{f} \cdot d\mathbf{r}$ along the curve $y = x^3$ from the origin to the point $(2, 8)$.

2. $\mathbf{f} = (y + \sin z)\mathbf{i} + x\mathbf{j} + x\cos z\,\mathbf{k}$. Show that $\mathbf{f}$ is conservative, and find $\varphi$ so that $\mathbf{f} = \nabla\varphi$.

3. Let $\mathbf{f} = -y\mathbf{i} + x\mathbf{j}$. Evaluate $\oint \mathbf{f} \cdot d\mathbf{r}$ around the circle with center at the origin and radius $a$.

4. Show that if $\mathbf{f} = \nabla\varphi$, $\varphi$ single-valued, the line integral around a closed path, written $\oint \mathbf{f} \cdot d\mathbf{r}$, vanishes. Prove the converse.

5. Let $\mathbf{f} = (-y\mathbf{i} + x\mathbf{j})/(x^2 + y^2)$. Show that $\mathbf{f} = \nabla\tan^{-1}(y/x)$ and integrate $\mathbf{f}$ around any closed path surrounding the origin, and show that for this path

$$\oint \mathbf{f} \cdot d\mathbf{r} = 2\pi$$

Why does this integral not vanish? See Prob. 4. Notice that $\mathbf{f}$ is not defined at the origin. The curve of integration contains the origin in its interior. This will be important in complex-variable theory.

6. If $\mathbf{A}$ is a constant vector, why is it true that $\oint \mathbf{A} \cdot d\mathbf{r} = 0$?


2.18. Stokes's Theorem. We begin by studying the locus of the end points of the vector

$$\mathbf{r} = x(u, v)\mathbf{i} + y(u, v)\mathbf{j} + z(u, v)\mathbf{k} \tag{2.80}$$

where $u$ and $v$ range over a continuous set of values and $x$, $y$, $z$ are assumed to have continuous partial derivatives in $u$ and $v$. For a fixed $v = v_0$ the end points of $\mathbf{r}$ trace a space curve as we let $u$ vary continuously. For each $v$ a space curve exists, and if we let $v$ vary, we obtain a locus of space curves which collectively form a surface. The curves obtained by setting $v = \text{constant}$ are called the $u$ curves, and the curves obtained by setting $u = \text{constant}$ are called the $v$ curves. We thus have a two-parameter family of curves forming the surface.

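A simple example will illustrate what we have been talking about. Consider

$$\mathbf{r} = r\sin\theta\cos\varphi\,\mathbf{i} + r\sin\theta\sin\varphi\,\mathbf{j} + r\cos\theta\,\mathbf{k} \qquad r = \text{constant},\quad 0 \le \varphi \le 2\pi,\quad 0 \le \theta \le \pi \tag{2.81}$$

We use $\theta$ and $\varphi$ instead of $u$ and $v$. Let us notice that $\mathbf{r} \cdot \mathbf{r} = r^2 = \text{constant}$, so that the end points of the vector $\mathbf{r}$ lie on a sphere of radius $r$. For a fixed $\theta = \theta_0$ the $z$ component of $\mathbf{r}$, namely, $r\cos\theta_0$, is constant. For $0 \le \varphi \le 2\pi$ the end points of $\mathbf{r}$ trace out a circle of latitude. The $\varphi$ curves are thus circles of latitude. It is easy to show that the $\theta$ curves are the meridians of longitude. We can show that the $\theta$ curves intersect the $\varphi$ curves orthogonally. The expression $\dfrac{\partial\mathbf{r}}{\partial\theta}$ represents a vector tangent to a $\theta$ curve, while $\dfrac{\partial\mathbf{r}}{\partial\varphi}$ represents a vector tangent to a $\varphi$ curve. From

$$\begin{aligned}
\frac{\partial\mathbf{r}}{\partial\theta} &= r\cos\theta\cos\varphi\,\mathbf{i} + r\cos\theta\sin\varphi\,\mathbf{j} - r\sin\theta\,\mathbf{k} \\
\frac{\partial\mathbf{r}}{\partial\varphi} &= -r\sin\theta\sin\varphi\,\mathbf{i} + r\sin\theta\cos\varphi\,\mathbf{j}
\end{aligned}$$

we have $\dfrac{\partial\mathbf{r}}{\partial\theta} \cdot \dfrac{\partial\mathbf{r}}{\partial\varphi} = 0$. Q.E.D.

If we move from a point $P(u, v)$ to the point $Q(u + du, v + dv)$, $P$ and $Q$ on the surface, then $d\mathbf{r} = \overrightarrow{PQ} = \dfrac{\partial\mathbf{r}}{\partial u}\,du + \dfrac{\partial\mathbf{r}}{\partial v}\,dv$. Hence

$$ds^2 = d\mathbf{r} \cdot d\mathbf{r} = \frac{\partial\mathbf{r}}{\partial u} \cdot \frac{\partial\mathbf{r}}{\partial u}\,du^2 + 2\frac{\partial\mathbf{r}}{\partial u} \cdot \frac{\partial\mathbf{r}}{\partial v}\,du\,dv + \frac{\partial\mathbf{r}}{\partial v} \cdot \frac{\partial\mathbf{r}}{\partial v}\,dv^2$$

where $ds$ is arc length. For the sphere $ds^2 = r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2$.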

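Let us now consider a surface of the type given by (2.80) bounded by a rectifiable curve $\Gamma$ that lies on the surface (see Fig. 2.25).

The vector $\dfrac{\partial\mathbf{r}}{\partial u} \times \dfrac{\partial\mathbf{r}}{\partial v}$ is perpendicular to the surface since $\dfrac{\partial\mathbf{r}}{\partial u}$ and $\dfrac{\partial\mathbf{r}}{\partial v}$ are tangent to the surface. As we move along the curve $\Gamma$, keeping our head in the same direction as $\dfrac{\partial\mathbf{r}}{\partial u} \times \dfrac{\partial\mathbf{r}}{\partial v}$, we keep track of the area to our left. It is this surface, $S$, with which we keep in touch. $\Gamma$ will be called the boundary of $S$. We neglect the rest of the surface $\mathbf{r}(u, v)$. We now consider a mesh on the surface formed by a collection of parametric curves (the $u$ and $v$ curves). The mesh will be taken fine enough so that, if $(u, v)$ are the coordinates of $A$, then $(u + du, v)$, $(u + du, v + dv)$, $(u, v + dv)$ are the coordinates of $B$, $C$, $D$, respectively (Fig. 2.25). Now consider

$$\oint_{ABCD} \mathbf{f} \cdot d\mathbf{r}$$

Fig. 2.25

The value of $\mathbf{f}$ at $A$ is $\mathbf{f}(u, v)$; at $B$ it is $\mathbf{f}(u + du, v)$; at $C$ it is $\mathbf{f}(u + du, v + dv)$; at $D$ it is $\mathbf{f}(u, v + dv)$. Now

$$\mathbf{f}(u + du, v) = \mathbf{f}(u, v) + d\mathbf{f}_u = \mathbf{f}(u, v) + \frac{\partial\mathbf{f}}{\partial u}\,du = \mathbf{f}(u, v) + du\left(\frac{\partial\mathbf{r}}{\partial u} \cdot \nabla\right)\mathbf{f}$$

except for infinitesimals of higher order. Similarly

$$\mathbf{f}(u, v + dv) = \mathbf{f}(u, v) + dv\left(\frac{\partial\mathbf{r}}{\partial v} \cdot \nabla\right)\mathbf{f}$$

Hence, but for infinitesimals of higher order,

$$\begin{aligned}
\oint_{ABCD} \mathbf{f} \cdot d\mathbf{r} &= \mathbf{f} \cdot \frac{\partial\mathbf{r}}{\partial u}\,du + \left[\mathbf{f} + du\left(\frac{\partial\mathbf{r}}{\partial u} \cdot \nabla\right)\mathbf{f}\right] \cdot \frac{\partial\mathbf{r}}{\partial v}\,dv - \left[\mathbf{f} + dv\left(\frac{\partial\mathbf{r}}{\partial v} \cdot \nabla\right)\mathbf{f}\right] \cdot \frac{\partial\mathbf{r}}{\partial u}\,du - \mathbf{f} \cdot \frac{\partial\mathbf{r}}{\partial v}\,dv \\
&= \left\{\left[\left(\frac{\partial\mathbf{r}}{\partial u} \cdot \nabla\right)\mathbf{f}\right] \cdot \frac{\partial\mathbf{r}}{\partial v} - \left[\left(\frac{\partial\mathbf{r}}{\partial v} \cdot \nabla\right)\mathbf{f}\right] \cdot \frac{\partial\mathbf{r}}{\partial u}\right\} du\,dv \\
&= (\nabla \times \mathbf{f}) \cdot \left(\frac{\partial\mathbf{r}}{\partial u} \times \frac{\partial\mathbf{r}}{\partial v}\right) du\,dv
\end{aligned}$$

(see Prob. 16, Sec. 2.15).

The vector $\dfrac{\partial\mathbf{r}}{\partial u}\,du \times \dfrac{\partial\mathbf{r}}{\partial v}\,dv$ is normal to the surface $S$. Its magnitude is the area of the sector $ABCD$, except for infinitesimals of higher order, since $ABCD$ is, strictly speaking, not a parallelogram. We define

$$d\boldsymbol{\sigma} = \frac{\partial\mathbf{r}}{\partial u} \times \frac{\partial\mathbf{r}}{\partial v}\,du\,dv \tag{2.82}$$

so that

$$\oint_{ABCD} \mathbf{f} \cdot d\mathbf{r} = \nabla \times \mathbf{f} \cdot d\boldsymbol{\sigma}$$

except for infinitesimals of higher order.

We now sum over the entire network. Interior line integrals cancel out in pairs, leaving only $\oint_\Gamma \mathbf{f} \cdot d\mathbf{r}$. Also

$$\sum (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma} \to \iint_S (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma}$$

as the mesh gets finer and finer. We thus have Stokes's theorem

$$\oint_\Gamma \mathbf{f} \cdot d\mathbf{r} = \iint_S (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma} \tag{2.83}$$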


Comments

1. Since $\Gamma$ may not be a parametric curve, (2.82) may not hold for a mesh circuit containing $\Gamma$ as part of its boundary. This is true, but fortunately we need not worry about the inequality. The line integrals cancel out in pairs no matter what subdivision we use, and for a fine network the areas next to $\Gamma$ contribute little to $\iint_S \nabla \times \mathbf{f} \cdot d\boldsymbol{\sigma}$. The limiting process takes care of this apparent negligence.

2. Stokes's theorem has been proved for a surface of the type $\mathbf{r}(u, v)$ [see (2.80)]. The theorem is easily seen to be true if we have a finite number of these surfaces connected continuously (edges). The case of an infinite number of edges requires further consideration.

3. Stokes's theorem is also true for a surface containing a conical point where no $d\boldsymbol{\sigma}$ can be defined. We just neglect to integrate over a small area covering this point. Since the area can be made arbitrarily small, it cannot affect the integral.

4. The reader is referred to the text of Kellogg, "Foundations of Potential Theory," for a more rigorous proof of Stokes's theorem.

5. The tremendous importance of Stokes's theorem cannot be overemphasized. It relates a line integral to a surface integral, and conversely.

6. In order to apply Stokes's theorem it is necessary that $\nabla \times \mathbf{f}$ exist and be integrable over the surface $S$.


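Examples of Stokes's Theorem

Example 2.29. Let $\mathbf{f} = -y\mathbf{i} + x\mathbf{j}$, and let us evaluate $\oint_\Gamma \mathbf{f} \cdot d\mathbf{r}$, $\Gamma$ any rectifiable curve in the $xy$ plane. Applying Stokes's theorem, we have

$$\oint_\Gamma \mathbf{f} \cdot d\mathbf{r} = \iint_S (\nabla \times \mathbf{f}) \cdot \mathbf{k}\,dy\,dx = \iint_S 2\mathbf{k} \cdot \mathbf{k}\,dy\,dx = 2A$$

Thus the area $A$, bounded by $\Gamma$, is

$$A = \tfrac{1}{2}\oint x\,dy - y\,dx \tag{2.84}$$

For the ellipse $x = a\cos t$, $y = b\sin t$, $0 \le t \le 2\pi$, we have $dx = -a\sin t\,dt$, $dy = b\cos t\,dt$, and $A = \tfrac{1}{2}\displaystyle\int_0^{2\pi} ab(\cos^2 t + \sin^2 t)\,dt = \pi ab$.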
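Formula (2.84) lends itself to direct computation for any closed plane curve given parametrically. The sketch below assumes the curve is parametrized over $0 \le t \le 2\pi$ and replaces it by an inscribed polygon of $n$ chords; `enclosed_area`, the semi-axes, and $n$ are illustrative choices.

```python
import math

def enclosed_area(curve, n=20000):
    """Approximate (1/2) * closed integral of (x dy - y dx) over t in [0, 2*pi]."""
    total = 0.0
    for j in range(n):
        x0, y0 = curve(2 * math.pi * j / n)
        x1, y1 = curve(2 * math.pi * (j + 1) / n)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)  # midpoint of the chord
        total += xm * (y1 - y0) - ym * (x1 - x0)
    return 0.5 * total

a, b = 3.0, 2.0  # semi-axes, chosen arbitrarily

def ellipse(t):
    return (a * math.cos(t), b * math.sin(t))

area = enclosed_area(ellipse)
print(area, math.pi * a * b)
```

Traversing the curve clockwise instead reverses the sign of the result, in keeping with the orientation convention adopted for $\Gamma$.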


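Example 2.30. If $\nabla \times \mathbf{f} = 0$ everywhere, it follows from Stokes's theorem that $\oint \mathbf{f} \cdot d\mathbf{r} = 0$ around every closed path. Conversely, assume $\oint \mathbf{f} \cdot d\mathbf{r} = 0$ around every closed path, and assume $\nabla \times \mathbf{f}$ is continuous. If $\nabla \times \mathbf{f} \not\equiv 0$, then $\nabla \times \mathbf{f} \neq 0$ at some point $P$. From continuity we have $\nabla \times \mathbf{f} \neq 0$ in some neighborhood of $P$ and $\nabla \times \mathbf{f}$ nearly parallel to $(\nabla \times \mathbf{f})_P$ in this neighborhood. Choose a small plane surface $S$ through $P$ with boundary $\Gamma$ in this neighborhood of $P$. The normal to the plane is chosen parallel to $(\nabla \times \mathbf{f})_P$. Then $\displaystyle\oint_\Gamma \mathbf{f} \cdot d\mathbf{r} = \iint_S \nabla \times \mathbf{f} \cdot d\boldsymbol{\sigma} > 0$, a contradiction.

An irrotational field is characterized by any of the three conditions

$$\begin{aligned}
&\text{(i)} \quad \mathbf{f} = \nabla\varphi \\
&\text{(ii)} \quad \nabla \times \mathbf{f} = 0 \\
&\text{(iii)} \quad \oint \mathbf{f} \cdot d\mathbf{r} = 0 \text{ for every closed path}
\end{aligned} \tag{2.85}$$

Any of these conditions implies the other two.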


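Example 2.31. Let $\mathbf{f} = f(x, y, z)\mathbf{a}$, where $\mathbf{a}$ is any constant vector. Applying Stokes's theorem yields

$$\oint f\mathbf{a} \cdot d\mathbf{r} = \iint_S \nabla \times (f\mathbf{a}) \cdot d\boldsymbol{\sigma}$$

$$\mathbf{a} \cdot \oint f\,d\mathbf{r} = \iint_S (\nabla f \times \mathbf{a}) \cdot d\boldsymbol{\sigma} = \mathbf{a} \cdot \iint_S d\boldsymbol{\sigma} \times \nabla f$$

so that $\mathbf{a} \cdot \left(\displaystyle\oint f\,d\mathbf{r} - \iint_S d\boldsymbol{\sigma} \times \nabla f\right) = 0$. Since $\mathbf{a}$ is arbitrary, it follows that

$$\oint f\,d\mathbf{r} = \iint_S d\boldsymbol{\sigma} \times \nabla f \tag{2.86}$$

It can be shown that

$$\oint d\mathbf{r} * \mathbf{f} = \iint_S (d\boldsymbol{\sigma} \times \nabla) * \mathbf{f} \tag{2.87}$$

The asterisk can denote scalar, vector, or ordinary multiplication. In the latter case $\mathbf{f}$ becomes the scalar $f$.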


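Problems

1. Show that $\oint \mathbf{r} \cdot d\mathbf{r} = 0$ by two methods.

2. Show that $\displaystyle\oint \mathbf{a} \times \mathbf{r} \cdot d\mathbf{r} = 2\mathbf{a} \cdot \iint_S d\boldsymbol{\sigma}$ if $\mathbf{a}$ is constant.

3. By Stokes's theorem show that $\nabla \times \nabla\varphi = 0$.

4. Prove that $\displaystyle\oint u\nabla v \cdot d\mathbf{r} = \iint_S \nabla u \times \nabla v \cdot d\boldsymbol{\sigma}$.

5. Prove that $\oint u\nabla v \cdot d\mathbf{r} = -\oint v\nabla u \cdot d\mathbf{r}$.

6. If $\mathbf{f} = \cos y\,\mathbf{i} + x(1 + \sin y)\mathbf{j}$, find the value of $\oint \mathbf{f} \cdot d\mathbf{r}$ around the circle with center at the origin and radius $r$.

7. If $\displaystyle\oint \mathbf{E} \cdot d\mathbf{r} = -\frac{1}{c}\frac{\partial}{\partial t}\iint_S \mathbf{B} \cdot d\boldsymbol{\sigma}$ for all surfaces $S$ show that $\nabla \times \mathbf{E} = -\dfrac{1}{c}\dfrac{\partial\mathbf{B}}{\partial t}$.

8. Show that $\displaystyle\iint_S \nabla \times \mathbf{f} \cdot d\boldsymbol{\sigma} = 0$ if $S$ is a closed surface.

9. $\mathbf{f} = (x^2 - y^2)\mathbf{i} + 2xy\,\mathbf{j}$. Find $\oint \mathbf{f} \cdot d\mathbf{r}$ around the square with vertices at $(0, 0)$, $(1, 0)$, $(1, 1)$, $(0, 1)$. Do this by two methods.

10. If a vector is normal to a surface at every point, show that its curl is zero or is tangent to the surface at each point.

11. Let $\mathbf{f} = \mathbf{a} \times \mathbf{g}$, $\mathbf{a}$ any constant vector. Apply Stokes's theorem to $\mathbf{f}$, and show that $\displaystyle\oint d\mathbf{r} \times \mathbf{g} = \iint_S (d\boldsymbol{\sigma} \times \nabla) \times \mathbf{g}$.

12. Assume $\nabla \times \mathbf{f} \neq 0$. Show that, if a scalar $u(x, y, z)$ exists such that

$$\nabla \times (u\mathbf{f}) = 0$$

then $\mathbf{f} \cdot \nabla \times \mathbf{f} = 0$.

13. Show that $\left|\oint d\mathbf{r} \times \mathbf{r}\right|$ taken around a closed curve in the $xy$ plane is twice the area enclosed by the curve.

14. Let $\mathbf{f} = X(x, y)\mathbf{i} + Y(x, y)\mathbf{j}$, and let $S$ be the area bounded by the closed curves $\Gamma_1$ and $\Gamma_2$ lying on the $xy$ plane, $\Gamma_2$ interior to $\Gamma_1$. Show that

$$\iint_S \left(\frac{\partial Y}{\partial x} - \frac{\partial X}{\partial y}\right) dy\,dx = \oint_{\Gamma_1} \mathbf{f} \cdot d\mathbf{r} - \oint_{\Gamma_2} \mathbf{f} \cdot d\mathbf{r}$$

Both line integrals are taken in the counterclockwise sense.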




2.19. The Divergence Theorem (Gauss). Let $S$ be a closed surface containing the volume $V$ in its interior. We assume $S$ has a well-defined normal almost everywhere. We now subdivide the volume into many elementary volumes. From Sec. 2.13 we note that, except for infinitesimals of higher order,

$$\iint_{\Delta S} \mathbf{f}\cdot d\boldsymbol{\sigma} = \nabla\cdot\mathbf{f}\,\Delta\tau$$

where $\Delta S$ is the entire surface bounding the elementary volume $\Delta\tau$. If we sum over all volumes and pass to the limit as the maximum $\Delta\tau \to 0$, we obtain

$$\iint_S \mathbf{f}\cdot d\boldsymbol{\sigma} = \iiint_V \nabla\cdot\mathbf{f}\,d\tau \qquad (2.88)$$

Equation (2.88) is the divergence theorem of Gauss. It relates a surface integral to a volume integral. It has tremendous applications to the various fields of science. In the derivation of (2.88) use has been made of the fact that for each internal $d\boldsymbol{\sigma}$ there is a $-d\boldsymbol{\sigma}$, so that all interior surface integrals cancel in pairs, leaving only the boundary surface $S$ as a contributing factor.

Equation (2.88) may be interpreted as follows: Any vector field $\mathbf{f}$ may be looked upon as representing the flow of a fluid, $\mathbf{f} = \rho\mathbf{v}$. From Sec. 2.13, $\nabla\cdot\mathbf{f}$ represents the loss of fluid per unit volume per unit time. The total loss of fluid per unit time throughout $V$ is $\iiint_V \nabla\cdot\mathbf{f}\,d\tau$. Now if $\mathbf{f}$ and $\nabla\cdot\mathbf{f}$ are continuous in $V$, there cannot be any sources or sinks in $V$ which would create or destroy matter. Consequently the total loss of fluid per unit time must be due to the fluid leaving the surface $S$. We might station a great many observers on the boundary $S$, let each observer measure the outward flow of fluid, and then sum up their recorded data. At a point on the surface with normal vector area $d\boldsymbol{\sigma}$ the component of the velocity normal to the surface is $\mathbf{v}\cdot\mathbf{N}$, and $\rho\mathbf{v}\cdot d\boldsymbol{\sigma} = \mathbf{f}\cdot d\boldsymbol{\sigma}$ represents the outward flow of mass per unit time. The total loss of mass per unit time is $\iint_S \mathbf{f}\cdot d\boldsymbol{\sigma}$, and thus (2.88) is obtained. For a more detailed and rigorous proof of Gauss's theorem, see Kellogg, "Foundations of Potential Theory."
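The two sides of (2.88) can be compared numerically. The following is a minimal sketch, assuming an illustrative field $\mathbf{f} = (x^2, y^2, z^2)$ on the unit cube (this field and the function names are not from the text); both integrals are approximated by the midpoint rule.

```python
# Numerical check of the divergence theorem (2.88) on the unit cube.
# The field f = (x^2, y^2, z^2) is an illustrative assumption, not from the text.

def f(x, y, z):
    return (x * x, y * y, z * z)

def div_f(x, y, z):
    # Divergence worked out by hand: d(x^2)/dx + d(y^2)/dy + d(z^2)/dz
    return 2.0 * (x + y + z)

def volume_integral(n=20):
    """Midpoint rule for the integral of div f over the cube [0,1]^3."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += div_f((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
    return total * h ** 3

def surface_flux(n=20):
    """Midpoint rule for the outward flux of f through the six faces."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * h, (j + 0.5) * h
            total += f(1.0, u, v)[0] - f(0.0, u, v)[0]  # faces x = 1 and x = 0
            total += f(u, 1.0, v)[1] - f(u, 0.0, v)[1]  # faces y = 1 and y = 0
            total += f(u, v, 1.0)[2] - f(u, v, 0.0)[2]  # faces z = 1 and z = 0
    return total * h ** 2

if __name__ == "__main__":
    print(volume_integral(), surface_flux())
```

For this field both integrals equal 3; the minus signs in `surface_flux` account for the outward normal pointing in the negative coordinate direction on the faces $x = 0$, $y = 0$, $z = 0$.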


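Example 2.32. Let $\mathbf{E} = q\mathbf{r}/r^3$, and let $V$ be a region surrounding the origin with boundary surface $S$. We wish to compute $\iint_S \mathbf{E}\cdot d\boldsymbol{\sigma}$. We cannot apply the divergence theorem to the region $V$ since $\nabla\cdot\mathbf{E}$ is discontinuous at $r = 0$. We overcome this difficulty by surrounding the origin by a small sphere $\Sigma$ of radius $\epsilon$ with center at the origin (see Fig. 2.26). The divergence theorem can be applied to the region $V'$ ($V$ less the interior of the $\Sigma$ sphere) with boundaries $S$ and $\Sigma$. Thus

$$\iint_S \mathbf{E}\cdot d\boldsymbol{\sigma} + \iint_\Sigma \mathbf{E}\cdot d\boldsymbol{\sigma} = \iiint_{V'} \nabla\cdot\left(\frac{q\mathbf{r}}{r^3}\right) d\tau$$

We have seen that $\nabla\cdot(q\mathbf{r}/r^3) = 0$, $r \neq 0$, so that

$$\iint_S \mathbf{E}\cdot d\boldsymbol{\sigma} = -\iint_\Sigma \frac{q\mathbf{r}}{r^3}\cdot d\boldsymbol{\sigma}$$

Now on $\Sigma$, $r = \epsilon$, $d\boldsymbol{\sigma} = (-\mathbf{r}/r)\,dS$, so that $(q\mathbf{r}/r^3)\cdot d\boldsymbol{\sigma} = (-q/\epsilon^2)\,dS$ and

$$\iint_\Sigma \frac{q\mathbf{r}}{r^3}\cdot d\boldsymbol{\sigma} = -\frac{q}{\epsilon^2}\iint_\Sigma dS = -4\pi q$$

Hence

$$\iint_S \mathbf{E}\cdot d\boldsymbol{\sigma} = 4\pi q$$

Fig. 2.26

$\mathbf{E} = q\mathbf{r}/r^3$ is the electrostatic field due to a point charge $q$ at the origin. For a continuous distribution of charge of density $\rho$ in $S$ it can be shown that

$$\iint_S \mathbf{E}\cdot d\boldsymbol{\sigma} = 4\pi\iiint_V \rho\,d\tau$$

Assuming that the divergence theorem can be applied (see Kellogg's "Foundations of Potential Theory" for proof of this fact), then

$$\iiint_V (\nabla\cdot\mathbf{E} - 4\pi\rho)\,d\tau = 0 \qquad (2.89)$$

Since (2.89) holds for all volumes we must have $\nabla\cdot\mathbf{E} = 4\pi\rho$, provided $\nabla\cdot\mathbf{E} - 4\pi\rho$ is continuous. In a uniform dielectric medium $\nabla\cdot\mathbf{D} = 4\pi\rho$, $\mathbf{D} = \kappa\mathbf{E}$. For magnetism one has $\nabla\cdot\mathbf{B} = 0$. The equations $\nabla\cdot\mathbf{D} = 4\pi\rho$, $\nabla\cdot\mathbf{B} = 0$ comprise two of Maxwell's equations. It can easily be shown that $\mathbf{E} = -\nabla V$ for an inverse-square force. In empty space $\rho = 0$ so that $\nabla^2 V = 0$. A great deal of electrostatic theory deals with the solution of Laplace's equation, $\nabla^2 V = 0$.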
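The result $\iint_S \mathbf{E}\cdot d\boldsymbol{\sigma} = 4\pi q$ does not depend on the shape of $S$. As a rough numerical illustration, the sketch below takes $S$ to be a cube centered at the origin (an assumption made only for convenience; the function name and parameters are hypothetical) and integrates $\mathbf{E}\cdot\mathbf{N}$ over one face by the midpoint rule.

```python
import math

def flux_through_cube(q=1.0, half_side=1.0, n=200):
    """Outward flux of E = q r / r^3 through the cube |x|, |y|, |z| <= half_side."""
    a = half_side
    h = 2.0 * a / n
    face = 0.0
    for i in range(n):
        for j in range(n):
            y = -a + (i + 0.5) * h
            z = -a + (j + 0.5) * h
            r3 = (a * a + y * y + z * z) ** 1.5
            face += q * a / r3  # E . N on the face x = a, where N = i
    face *= h * h
    return 6.0 * face  # by symmetry the six faces contribute equally

if __name__ == "__main__":
    print(flux_through_cube(), 4.0 * math.pi)
```

Changing `half_side` leaves the computed flux essentially unchanged, in agreement with the example.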


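Example 2.33. Green's Theorem. We apply (2.88) to $\mathbf{f}_1 = u\nabla v$ and $\mathbf{f}_2 = v\nabla u$ and obtain

$$\iiint_V \nabla\cdot(u\nabla v)\,d\tau = \iiint_V (u\nabla^2 v + \nabla u\cdot\nabla v)\,d\tau = \iint_S u\nabla v\cdot d\boldsymbol{\sigma}$$

$$\iiint_V \nabla\cdot(v\nabla u)\,d\tau = \iiint_V (v\nabla^2 u + \nabla v\cdot\nabla u)\,d\tau = \iint_S v\nabla u\cdot d\boldsymbol{\sigma} \qquad (2.90)$$

Subtracting, we obtain

$$\iiint_V (u\nabla^2 v - v\nabla^2 u)\,d\tau = \iint_S (u\nabla v - v\nabla u)\cdot d\boldsymbol{\sigma} \qquad (2.91)$$

Equations (2.90) and (2.91) are Green's formulas.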

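Example 2.34. A Uniqueness Theorem. Let $\varphi$ and $\psi$ satisfy Laplace's equation inside a region $R$, and let $\varphi = \psi$ on the boundary $S$ of $R$. We show that $\varphi \equiv \psi$. Let $\theta = \varphi - \psi$. Hence $\nabla^2\theta = \nabla^2\varphi - \nabla^2\psi = 0$ in $R$, and $\theta = 0$ on $S$. Applying (2.90) with $u = v = \theta$ yields

$$\iiint_R (\theta\nabla^2\theta + \nabla\theta\cdot\nabla\theta)\,d\tau = \iint_S \theta\nabla\theta\cdot d\boldsymbol{\sigma}$$

so that $\iiint_R (\nabla\theta\cdot\nabla\theta)\,d\tau = 0$. Since $\nabla\theta$ is continuous ($\nabla^2\theta$ is assumed to exist), we have $\nabla\theta = 0$ in $R$ and $\theta$ = constant. Thus $\varphi - \psi$ = constant. Assuming $\varphi - \psi$ is continuous as we approach the boundary, we must have $\varphi \equiv \psi$, since $\varphi = \psi$ on the boundary.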

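Example 2.35. Let $\mathbf{f} = f(x, y, z)\mathbf{a}$, where $\mathbf{a}$ is any constant vector. Applying (2.88) yields

$$\iint_S f\mathbf{a}\cdot d\boldsymbol{\sigma} = \iiint_R \nabla\cdot(f\mathbf{a})\,d\tau = \iiint_R \mathbf{a}\cdot\nabla f\,d\tau$$

so that $\mathbf{a}\cdot\iint_S f\,d\boldsymbol{\sigma} = \mathbf{a}\cdot\iiint_R \nabla f\,d\tau$. Hence

$$\iint_S f\,d\boldsymbol{\sigma} = \iiint_R \nabla f\,d\tau$$

We leave it to the reader to show that

$$\iint_S d\boldsymbol{\sigma} * \mathbf{f} = \iiint_R (\nabla * \mathbf{f})\,d\tau \qquad (2.92)$$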
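Example 2.36. A vector field $\mathbf{f}$ whose flux $\left(\iint_S \mathbf{f}\cdot d\boldsymbol{\sigma}\right)$ over every closed surface vanishes is called a solenoidal field. From (2.88) it follows that $\nabla\cdot\mathbf{f} = 0$. Now assume $\mathbf{f} = \nabla\times\mathbf{g}$, so that $\nabla\cdot\mathbf{f} = \nabla\cdot(\nabla\times\mathbf{g}) = 0$. Thus the curl of a vector is a solenoidal vector. Is the converse true? If $\nabla\cdot\mathbf{f} = 0$, can we write $\mathbf{f} = \nabla\times\mathbf{g}$? The answer is "yes," and we call $\mathbf{g}$ the vector potential of $\mathbf{f}$. Notice that $\mathbf{g}$ cannot be unique, for $\nabla\times(\mathbf{g} + \nabla\varphi) = \nabla\times\mathbf{g}$. We now exhibit a method for determining $\mathbf{g}$. Let $\mathbf{f} = X\mathbf{i} + Y\mathbf{j} + Z\mathbf{k}$, and assume $\mathbf{g} = \alpha\mathbf{i} + \beta\mathbf{j} + \gamma\mathbf{k}$. We wish to determine $\mathbf{g}$ so that $\nabla\times\mathbf{g} = \mathbf{f}$ provided $\nabla\cdot\mathbf{f} = 0$. Thus $\alpha$, $\beta$, $\gamma$ must satisfy

$$\frac{\partial\gamma}{\partial y} - \frac{\partial\beta}{\partial z} = X$$

$$\frac{\partial\alpha}{\partial z} - \frac{\partial\gamma}{\partial x} = Y \qquad (2.93)$$

$$\frac{\partial\beta}{\partial x} - \frac{\partial\alpha}{\partial y} = Z$$

We are to determine $\alpha$, $\beta$, $\gamma$. Assume $\alpha = 0$. Then

$$\frac{\partial\gamma}{\partial y} - \frac{\partial\beta}{\partial z} = X \qquad -\frac{\partial\gamma}{\partial x} = Y \qquad \frac{\partial\beta}{\partial x} = Z$$

Of necessity,

$$\beta(x, y, z) = \int_{x_0}^{x} Z(x, y, z)\,dx + \sigma(y, z)$$

$$\gamma(x, y, z) = -\int_{x_0}^{x} Y(x, y, z)\,dx + \tau(y, z)$$

Hence

$$\frac{\partial\gamma}{\partial y} - \frac{\partial\beta}{\partial z} = -\int_{x_0}^{x}\left(\frac{\partial Y}{\partial y} + \frac{\partial Z}{\partial z}\right)dx + \frac{\partial\tau}{\partial y} - \frac{\partial\sigma}{\partial z} = \int_{x_0}^{x}\frac{\partial X}{\partial x}\,dx + \frac{\partial\tau}{\partial y} - \frac{\partial\sigma}{\partial z}$$

since $\nabla\cdot\mathbf{f} = 0$. Therefore

$$\frac{\partial\gamma}{\partial y} - \frac{\partial\beta}{\partial z} = X(x, y, z) - X(x_0, y, z) + \frac{\partial\tau}{\partial y} - \frac{\partial\sigma}{\partial z}$$

We need only choose $\sigma$ and $\tau$ so that $\frac{\partial\tau}{\partial y} - \frac{\partial\sigma}{\partial z} = X(x_0, y, z)$. Let $\sigma = 0$ and

$$\tau(y, z) = \int_{y_0}^{y} X(x_0, y, z)\,dy$$

$x_0$ and $y_0$ are constants of integration. Hence

$$\mathbf{g} = \mathbf{j}\int_{x_0}^{x} Z(x, y, z)\,dx + \mathbf{k}\left[\tau(y, z) - \int_{x_0}^{x} Y(x, y, z)\,dx\right] + \nabla\varphi \qquad (2.94)$$

where $\tau(y, z)$ is defined above and $\varphi$ is arbitrary.

For example, let $\mathbf{f} = x^2\mathbf{i} - xy\,\mathbf{j} - xz\,\mathbf{k}$, so that $\nabla\cdot\mathbf{f} = 0$. Choosing $x_0 = y_0 = 0$ yields

$$\tau(y, z) = \int_0^y 0\,dy = 0 \qquad \int_0^x Z\,dx = -\int_0^x xz\,dx = -\frac{zx^2}{2} \qquad -\int_0^x Y\,dx = \int_0^x xy\,dx = \frac{yx^2}{2}$$

and $\mathbf{g} = -(zx^2/2)\mathbf{j} + (yx^2/2)\mathbf{k} + \nabla\varphi$. It is easy to verify that $\mathbf{f} = \nabla\times\mathbf{g}$.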
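The closing verification can also be carried out numerically. The sketch below is only an illustration (the helper `curl` and its step size `h` are assumptions, not part of the text): it takes the worked example with $\varphi = 0$ and approximates $\nabla\times\mathbf{g}$ by central differences.

```python
def f(x, y, z):
    # The solenoidal field of the worked example
    return (x * x, -x * y, -x * z)

def g(x, y, z):
    # Vector potential from (2.94) with x0 = y0 = 0 and phi = 0
    return (0.0, -z * x * x / 2.0, y * x * x / 2.0)

def curl(F, x, y, z, h=1e-5):
    """Central-difference approximation to curl F at the point (x, y, z)."""
    def d(comp, axis):
        # Partial derivative of component `comp` with respect to coordinate `axis`
        p = [x, y, z]
        m = [x, y, z]
        p[axis] += h
        m[axis] -= h
        return (F(*p)[comp] - F(*m)[comp]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

if __name__ == "__main__":
    point = (0.7, -1.2, 0.4)
    print(curl(g, *point), f(*point))
```

At any sample point the two printed triples agree to within rounding error, since central differences are exact for the quadratic components of $\mathbf{g}$ apart from floating-point effects.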

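Example 2.37. Integration of Laplace's Equation. Let $S$ be the surface of a region $R$ for which $\nabla^2\varphi = 0$. Let $P$ be any point of $R$, and let $r$ be the distance from $P$ to any point $Q$ in $R$ or on $S$. We make use of Green's formula

$$\iiint_R (\varphi\nabla^2\psi - \psi\nabla^2\varphi)\,d\tau = \iint_S (\varphi\nabla\psi - \psi\nabla\varphi)\cdot d\boldsymbol{\sigma}$$

We choose $\psi = 1/r$, which yields a discontinuity inside $R$, namely, at $P$, where $r = 0$. In order to overcome this difficulty, we proceed as in Example 2.32. Surround $P$ by a sphere $\Sigma$ of radius $\epsilon$. Using the fact that $\nabla^2\varphi = \nabla^2\psi = 0$ in $R'$ ($R$ less the $\Sigma$ sphere) yields

$$\iint_S \left(\varphi\nabla\frac{1}{r} - \frac{1}{r}\nabla\varphi\right)\cdot d\boldsymbol{\sigma} = \iint_\Sigma \left(\varphi\nabla\frac{1}{r} - \frac{1}{r}\nabla\varphi\right)\cdot d\boldsymbol{\sigma}$$

where on $\Sigma$ the vector $d\boldsymbol{\sigma}$ is taken to point away from $P$. We leave it to the reader to show that, as $\epsilon \to 0$, $\iint_\Sigma (1/r)\nabla\varphi\cdot d\boldsymbol{\sigma} \to 0$ and $\iint_\Sigma \varphi\nabla(1/r)\cdot d\boldsymbol{\sigma} \to -4\pi\varphi(P)$. Hence

$$\varphi(P) = \frac{1}{4\pi}\iint_S \left(\frac{1}{r}\nabla\varphi - \varphi\nabla\frac{1}{r}\right)\cdot d\boldsymbol{\sigma} \qquad (2.95)$$

This formula states that the value of $\varphi$ at any point $P$ in $R$ is determined by the value of $\varphi$ and $\nabla\varphi\cdot\mathbf{N} = \partial\varphi/\partial n$ on the surface $S$, where $\mathbf{N}$ is the unit vector normal to $S$.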

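Problems

1. If $\mathbf{f} = x\mathbf{i} - y\mathbf{j} + (z^2 - 1)\mathbf{k}$, find the value of $\iint_S \mathbf{f}\cdot d\boldsymbol{\sigma}$ over the closed surface bounded by the planes $z = 0$, $z = 1$, and the cylinder $x^2 + y^2 = 1$.

2. Show that $(x\mathbf{i} + y\mathbf{j})/(x^2 + y^2)$ is solenoidal.

3. Prove that $\iint_S d\boldsymbol{\sigma} = 0$ if $S$ is a closed surface.

4. Prove that $\iint_S d\boldsymbol{\sigma}\times\mathbf{f} = \iiint_V \nabla\times\mathbf{f}\,d\tau$.

5. Show that $\iiint_V \mathbf{f}\cdot\nabla\varphi\,d\tau = \iint_S \varphi\mathbf{f}\cdot d\boldsymbol{\sigma} - \iiint_V \varphi\nabla\cdot\mathbf{f}\,d\tau$.

6. If $\mathbf{w} = \frac{1}{2}\nabla\times\mathbf{v}$, $\mathbf{v} = \nabla\times\mathbf{u}$, show that

$$\frac{1}{2}\iiint_R v^2\,d\tau = \frac{1}{2}\iint_S (\mathbf{u}\times\mathbf{v})\cdot d\boldsymbol{\sigma} + \iiint_R \mathbf{u}\cdot\mathbf{w}\,d\tau$$

7. If $\mathbf{v} = \nabla\varphi$, $\nabla^2\varphi = 0$, show that for a closed surface $S$, $\iiint_V v^2\,d\tau = \iint_S \varphi\mathbf{v}\cdot d\boldsymbol{\sigma}$.

8. If $\mathbf{f}_1$ and $\mathbf{f}_2$ are irrotational, show that $\mathbf{f}_1\times\mathbf{f}_2$ is solenoidal.

9. Find a vector $\mathbf{g}$ such that $yz\,\mathbf{i} - zx\,\mathbf{j} + (x^2 + y^2)\mathbf{k} = \nabla\times\mathbf{g}$.

10. Find a vector $\mathbf{g}$ such that $\mathbf{r}/r^3 = \nabla\times\mathbf{g}$.

11. If $-\iint_S \rho\mathbf{v}\cdot d\boldsymbol{\sigma} = \iiint_V \frac{\partial\rho}{\partial t}\,d\tau$ for all surfaces, show that $\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0$.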


12. Find a vector f such that V-f = 22 +y —1,V Xf = 2. 




REFERENCES 


Brand, L.: "Vector and Tensor Analysis," John Wiley & Sons, Inc., New York, 1947.

Kellogg, O. D.: "Foundations of Potential Theory," John Murray, London, 1929.

Lass, H.: "Vector and Tensor Analysis," McGraw-Hill Book Company, Inc., New York, 1950.

Phillips, H. B.: "Vector Analysis," John Wiley & Sons, Inc., New York, 1933.

Rutherford, D. E.: "Vector Methods," Oliver & Boyd, Ltd., London, 1944.

Weatherburn, C. E.: "Elementary Vector Analysis," George Bell & Sons, Ltd., London, 1921.

Weatherburn, C. E.: "Advanced Vector Analysis," George Bell & Sons, Ltd., London, 1944.





CHAPTER 3 


TENSOR ANALYSIS 


3.1. Introduction. In this chapter we wish to generalize the notion 
of a vector. In Chap. 2 the concept of a vector was highly geometric 
since we looked upon a vector as a directed line segment. This spatial 
concept of a vector is easily understood for a space of one, two, or three 
dimensions. To extend the idea of a vector to a space of dimension 
higher than 3 (whatever that may be) becomes rather difficult if we 
hold to the simple idea that a vector is to be a directed line segment. 
To avoid this difficulty, we look for an algebraic viewpoint of a vector. 
This can be done in the following manner: In Euclidean coordinates the vector $\mathbf{A}$ can be written $\mathbf{A} = A_1\mathbf{i} + A_2\mathbf{j} + A_3\mathbf{k}$. We can represent the vector $\mathbf{A}$ by the number triple $(A_1, A_2, A_3)$ and write $\mathbf{A} = (A_1, A_2, A_3)$. The unit vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ can be represented by the triples (1, 0, 0), (0, 1, 0), (0, 0, 1), respectively. We define addition of number triples and multiplication of a number triple by a real number $a$ as follows:

$$(A_1, A_2, A_3) + (B_1, B_2, B_3) = (A_1 + B_1, A_2 + B_2, A_3 + B_3)$$

$$a(A_1, A_2, A_3) = (aA_1, aA_2, aA_3) \qquad (3.1)$$

Equations (3.1) define a linear vector space. We note that

$$(A_1, A_2, A_3) = A_1(1, 0, 0) + A_2(0, 1, 0) + A_3(0, 0, 1) \qquad (3.2)$$

The elements $A_1$, $A_2$, $A_3$ of $\mathbf{A}$ are called the components of the number triple.

Throughout this chapter we shall use the summation convention, and at times, for convenience, the superscript convention, of Chap. 1. To continue our discussion of vectors, let $\mathbf{A} = (A_1, A_2, A_3)$, $\mathbf{B} = (B^1, B^2, B^3)$ be two vectors represented as triples. We define the scalar, or inner, product of $\mathbf{A}$ and $\mathbf{B}$ by

$$\mathbf{A}\cdot\mathbf{B} = A_\alpha B^\alpha \qquad (3.3)$$

The square of the norm (or length) of the vector $\mathbf{A}$ is defined to be $L^2 = A_i A^i$, $A_i = A^i$, $i = 1, 2, 3$. If $L^2 = 1$, $\mathbf{A}$ is a unit vector. The cosine of the angle between two vectors is defined by

$$\cos\theta = \frac{A_\alpha B^\alpha}{(A_\alpha A^\alpha)^{1/2}(B_\beta B^\beta)^{1/2}} \qquad (3.4)$$

It is not difficult to show that $|\cos\theta| \leq 1$. We must show that

$$(A_\alpha A^\alpha)(B_\beta B^\beta) \geq (A_\alpha B^\alpha)^2 \qquad (3.5)$$

Let us consider

$$y = \sum_{\alpha=1}^{3} (A_\alpha x - B_\alpha)^2 \geq 0 \qquad (3.6)$$

for $x$ real.

Now $y = A_\alpha A^\alpha x^2 - 2A_\alpha B^\alpha x + B_\alpha B^\alpha$ represents a parabola. Since $y \geq 0$ for all real $x$, $y = 0$ has no real roots or two equal real roots, so that the discriminant cannot be positive. Hence (3.5), the Schwarz-Cauchy inequality, holds. Let the reader show that, if the equality sign holds in (3.5), then $A_\alpha = \lambda B_\alpha$, $\alpha = 1, 2, 3$. If $\lambda > 0$, $\cos\theta = 1$. If $\lambda < 0$, $\cos\theta = -1$. If $\cos\theta = 0$, that is,

$$A_\alpha B^\alpha = 0$$

we say that $\mathbf{A}$ and $\mathbf{B}$ are orthogonal.

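We can define the vector, or cross, product of $\mathbf{A}$ and $\mathbf{B}$ algebraically as follows: Let $A^i$, $B^i$, $i = 1, 2, 3$, be the components of $\mathbf{A}$ and $\mathbf{B}$, respectively. Let $C_k$, $k = 1, 2, 3$, be the components of the number triple $\mathbf{C} = (C_1, C_2, C_3)$, where

$$C_k = \epsilon_{ijk} A^i B^j \qquad (3.7)$$

The epsilons of (3.7) were defined in Sec. 1.2. We note that

$$C_1 = \epsilon_{ij1} A^i B^j = \epsilon_{231} A^2 B^3 + \epsilon_{321} A^3 B^2 = A^2 B^3 - A^3 B^2$$

$$C_2 = A^3 B^1 - A^1 B^3 \qquad (3.8)$$

$$C_3 = A^1 B^2 - A^2 B^1$$

Let us consider the scalar product of $\mathbf{A}$ and $\mathbf{C}$. We have

$$\mathbf{A}\cdot\mathbf{C} = A^k C_k = \epsilon_{ijk} A^i A^k B^j = \epsilon_{kji} A^k A^i B^j$$

the last step being an interchange of the dummy indices $i$ and $k$. But $\epsilon_{ijk} = -\epsilon_{kji}$, so that $\mathbf{A}\cdot\mathbf{C} = 0$. Similarly $\mathbf{B}\cdot\mathbf{C} = 0$.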
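The algebraic definitions (3.3), (3.5), and (3.7) translate directly into code. The following is a minimal sketch using zero-based indices; the closed-form expression used for the permutation symbol in `epsilon` is a convenience chosen here, not the definition of Sec. 1.2.

```python
def epsilon(i, j, k):
    """Permutation symbol for indices 0, 1, 2: +1, -1, or 0."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(A, B):
    # C_k = eps_{ijk} A^i B^j, summed over i and j, as in (3.7)
    return [sum(epsilon(i, j, k) * A[i] * B[j]
                for i in range(3) for j in range(3))
            for k in range(3)]

def dot(A, B):
    # A . B = A_alpha B^alpha, as in (3.3)
    return sum(a * b for a, b in zip(A, B))

if __name__ == "__main__":
    A, B = [1, 2, 3], [4, 5, 6]
    C = cross(A, B)
    print(C, dot(A, C), dot(B, C))
    # Schwarz-Cauchy inequality (3.5)
    print(dot(A, A) * dot(B, B) >= dot(A, B) ** 2)
```

For the sample triples the product is orthogonal to both factors, as the index argument above requires.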
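If $\mathbf{A} = (A_1(t), A_2(t), A_3(t))$, then $\frac{d\mathbf{A}}{dt}$ is defined to be the triple $\left(\frac{dA_1}{dt}, \frac{dA_2}{dt}, \frac{dA_3}{dt}\right)$. If $\varphi(x^1, x^2, x^3)$ is a scalar function of $x = x^1$, $y = x^2$, $z = x^3$, the gradient of $\varphi$ is defined to be the triple

$$\operatorname{grad}\varphi = \nabla\varphi = \left(\frac{\partial\varphi}{\partial x^1}, \frac{\partial\varphi}{\partial x^2}, \frac{\partial\varphi}{\partial x^3}\right) \qquad (3.9)$$

It is easy to define the divergence and curl of a vector in terms of number triples. Let the reader show that

$$B^i = \frac{1}{2}\epsilon^{ijk}\left(\frac{\partial A_k}{\partial x^j} - \frac{\partial A_j}{\partial x^k}\right) \qquad (3.10)$$

yields the conventional $\nabla\times\mathbf{A}$. The divergence of $\mathbf{A}$ is the scalar

$$\operatorname{div}\mathbf{A} = \frac{\partial A^\alpha}{\partial x^\alpha} \qquad (3.11)$$

The definitions above refer to Euclidean coordinates.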

In the above presentation geometry has been omitted. Everything 
depends on the rules for manipulating the number triples. One need 
only define an n-dimensional vector as an n-tuple 


$$\mathbf{A} = (A_1, A_2, A_3, \ldots, A_n) \qquad (3.12)$$


The definitions for manipulating triples are easily extended to the case 
of n-tuples. There is some difficulty in connection with the vector 
product and the curl. This will be discussed later. 

Let us hope that the reader does not feel that it is absolutely necessary 
to visualize a vector in a four-dimensional space in order to speak of such 
a vector. He may feel that an abstract idea can have no place in the
realm of science. This is not the case. No one can visualize a four- 
dimensional space. Yet the general theory of relativity is essentially 
a theory of a four-dimensional Riemannian geometry. 

One further generalization before we take up tensor formalism. Let $S$ be a deformable body which is in a given state of rest. Let $P(x^1, x^2, x^3)$ be any point of this body. Now assume that the body is displaced from its position to a new position. Let $s_i(x^1, x^2, x^3)$, $i = 1, 2, 3$, represent the displacement vector of the point $P$. The displacement vector at a nearby point $Q(x^1 + dx^1, x^2 + dx^2, x^3 + dx^3)$ is

$$s_i(x^1, x^2, x^3) + \frac{\partial s_i}{\partial x^\alpha}\,dx^\alpha \qquad (3.13)$$

except for infinitesimals of higher order. We can write

$$\frac{\partial s_i}{\partial x^j} = \frac{1}{2}\left(\frac{\partial s_i}{\partial x^j} + \frac{\partial s_j}{\partial x^i}\right) + \frac{1}{2}\left(\frac{\partial s_i}{\partial x^j} - \frac{\partial s_j}{\partial x^i}\right) \qquad (3.14)$$

The nine terms of $\frac{1}{2}\left(\frac{\partial s_i}{\partial x^j} + \frac{\partial s_j}{\partial x^i}\right)$, $i, j = 1, 2, 3$, are highly important in deformation theory. It is convenient to represent these nine terms as the elements of a 3 by 3 matrix. A simple generalization tells us that




we may wish to speak of a large collection of elements represented by 


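$$T^{\alpha_1\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_s}(x^1, x^2, \ldots, x^n) \qquad \alpha_1, \ldots, \alpha_r, \beta_1, \ldots, \beta_s = 1, 2, \ldots, n \qquad (3.15)$$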


This is our first introduction to tensors. Before we speak of ten- 
sors, we must say a word about coordinate systems and coordinate 
transformations. 

The totality of n-tuples $(x^1, x^2, \ldots, x^n)$ forms the arithmetic n-space, the $x^i$ real, $i = 1, 2, \ldots, n$. By a space of n dimensions we mean any set of objects which can be put into one-to-one correspondence with the arithmetic n-space. Thus

$$P \leftrightarrow (x^1, x^2, \ldots, x^n) \qquad (3.16)$$

The correspondence (3.16) attaches an n-tuple to each point $P$ of our space. We look upon (3.16) as a coordinate system imposed on the space of elements $P$. We now consider the n equations

$$y^i = y^i(x^1, x^2, \ldots, x^n) \qquad i = 1, 2, \ldots, n \qquad (3.17)$$

and assume that we can solve for the $x^i$ so that

$$x^i = x^i(y^1, y^2, \ldots, y^n) \qquad i = 1, 2, \ldots, n \qquad (3.18)$$

We assume that (3.17) and (3.18) are single-valued. The reader may read that excellent text, "Mathematical Analysis," by Goursat-Hedrick on the conditions imposed upon (3.17) in order that (3.18) exists. The n-space of which $P$ is a point can also be put into one-to-one correspondence with the set of n-tuples of the form $(y^1, y^2, \ldots, y^n)$, so that a new coordinate system has been imposed on our n-space. The point $P$ has not changed, but we have a new method for attaching coordinates to the elements $P$ of our n-space. It is for this reason that (3.17) is called a transformation of coordinates.
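3.2. Contravariant Vectors. We consider the arithmetic n-space and define a space curve, $\Gamma$, in this $V_n$ by

$$x^i = x^i(t) \qquad i = 1, 2, \ldots, n \qquad \alpha \leq t \leq \beta \qquad (3.19)$$

In a 3-space the components of a vector tangent to the space curve are $\frac{dx^1}{dt}, \frac{dx^2}{dt}, \frac{dx^3}{dt}$. Generalizing, we define a tangent vector to the space curve (3.19) as the n-tuple $\left(\frac{dx^1}{dt}, \frac{dx^2}{dt}, \ldots, \frac{dx^n}{dt}\right)$, written

$$\frac{dx^i}{dt} \qquad i = 1, 2, \ldots, n \qquad (3.20)$$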


The elements of (3.20) are the components of a tangent vector to the space curve (3.19). Now let us consider an allowable (one-to-one and single-valued) coordinate transformation of the type given by (3.17). We immediately see that

$$y^i = y^i(x^1, x^2, \ldots, x^n) = y^i[x^1(t), x^2(t), \ldots, x^n(t)] = y^i(t) \qquad (3.21)$$

$i = 1, 2, \ldots, n$. Equation (3.21) represents the space curve $\Gamma$ as described by the $y$ coordinate system. An observer using the $y$ coordinate system will say that the components of the tangent vector to $\Gamma$ are given by

$$\frac{dy^i}{dt} \qquad i = 1, 2, \ldots, n \qquad (3.22)$$

Needless to say, the $x$ coordinate system describing $\Gamma$ is no more important than the $y$ coordinate system used to describe the same curve. Remember that the points of the curve have not changed: only the description of these points has changed. We cannot say that $\left(\frac{dx^1}{dt}, \frac{dx^2}{dt}, \ldots, \frac{dx^n}{dt}\right)$ is the tangent vector any more than we can say that $\left(\frac{dy^1}{dt}, \frac{dy^2}{dt}, \ldots, \frac{dy^n}{dt}\right)$ is the tangent vector. If we were to consider all allowable coordinate transformations, we would obtain the whole class of tangent elements, each element claiming to be a tangent vector for that particular coordinate system. It is the abstract collection of all these elements which is said to be the tangent vector. We now ask what relationship exists between the components of the tangent vector in the $x$ coordinate system and the components of the tangent vector in the $y$ coordinate system. We easily answer this question since

$$\frac{dy^i}{dt} = \frac{\partial y^i}{\partial x^\alpha}\frac{dx^\alpha}{dt} \qquad i = 1, 2, \ldots, n \qquad (3.23)$$

We note that, in general, $\frac{dy^i}{dt}$ depends on every $\frac{dx^\alpha}{dt}$ since the index $\alpha$ is summed from 1 to $n$. We leave it to the reader to show that

$$\frac{dx^i}{dt} = \frac{\partial x^i}{\partial y^\alpha}\frac{dy^\alpha}{dt} \qquad (3.24)$$
We now make the following generalization: The numbers $A^i(x^1, x^2, \ldots, x^n)$, $i = 1, 2, \ldots, n$, which transform according to the law

$$\bar{A}^i(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n) = \frac{\partial\bar{x}^i}{\partial x^\alpha} A^\alpha(x^1, x^2, \ldots, x^n) \qquad i = 1, 2, \ldots, n \qquad (3.25)$$

under the coordinate transformation

$$\bar{x}^i = \bar{x}^i(x^1, x^2, \ldots, x^n) \qquad i = 1, 2, \ldots, n \qquad (3.26)$$

are said to be the components of a contravariant vector. The contravariant vector is not just the set of components in one coordinate system but is rather the abstract quantity which is represented in each coordinate system $x$ by the set of components $A^i(x)$.

One can manufacture as many contravariant vector fields as one desires. Let $(A^1(x), A^2(x), \ldots, A^n(x))$ be any n-tuple in an $x = (x^1, x^2, \ldots, x^n)$ coordinate system. In any $\bar{x}$ coordinate system related to the $x$ coordinate system by (3.26), define the $\bar{A}^i$, $i = 1, 2, \ldots, n$, as in (3.25). We have constructed a contravariant vector field by this device. If the
components of a contravariant vector are known in one coordinate system, 
then the components are known in all other allowable coordinate systems, 
by (3.25). A coordinate transformation does not yield a new vector; it 
merely changes the components of the same vector. We say that a 
contravariant vector is an invariant under a coordinate transformation. 
An object of any sort which is not changed by transformations of coordi- 
nates is called an invariant. If the reader is confused, let him remember 
that a point is an invariant. The point does not change under a coordi- 
nate transformation; the description of the point changes. 

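The law of transformation for a contravariant vector is transitive. Let

$$\bar{A}^i(\bar{x}) = \frac{\partial\bar{x}^i}{\partial x^\alpha} A^\alpha(x) \qquad \bar{\bar{A}}^i(\bar{\bar{x}}) = \frac{\partial\bar{\bar{x}}^i}{\partial\bar{x}^\beta}\bar{A}^\beta(\bar{x})$$

Then

$$\bar{\bar{A}}^i(\bar{\bar{x}}) = \frac{\partial\bar{\bar{x}}^i}{\partial\bar{x}^\beta}\frac{\partial\bar{x}^\beta}{\partial x^\alpha} A^\alpha(x) = \frac{\partial\bar{\bar{x}}^i}{\partial x^\alpha} A^\alpha(x)$$

which proves our statement.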


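Example 3.1. Let $X$, $Y$, $Z$ be the components of a contravariant vector in a Euclidean space for which $ds^2 = dx^2 + dy^2 + dz^2$. The components of this vector in a cylindrical coordinate system are

$$R = \frac{\partial r}{\partial x}X + \frac{\partial r}{\partial y}Y + \frac{\partial r}{\partial z}Z = \cos\theta\,X + \sin\theta\,Y$$

$$\Theta = \frac{\partial\theta}{\partial x}X + \frac{\partial\theta}{\partial y}Y + \frac{\partial\theta}{\partial z}Z = -\frac{\sin\theta}{r}X + \frac{\cos\theta}{r}Y$$

$$\bar{Z} = \frac{\partial z}{\partial x}X + \frac{\partial z}{\partial y}Y + \frac{\partial z}{\partial z}Z = Z$$

where $r = (x^2 + y^2)^{1/2}$, $\theta = \tan^{-1}(y/x)$, $z = z$. Notice that the dimension of $\Theta$ is not the same as the dimensions of $X$, $Y$, and $Z$. The quantity $r\Theta$ has the correct dimensions. $(R, r\Theta, Z)$ are the physical components of the vector as distinguished from the vector components $(R, \Theta, Z)$. $R$, $r\Theta$, and $Z$ are the projections of the vector

$$\mathbf{A} = X\mathbf{i} + Y\mathbf{j} + Z\mathbf{k}$$

on the unit vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z = \mathbf{k}$, respectively.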
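The transformation of Example 3.1 can be tried on a concrete field. The sketch below is illustrative only (the function name and the sample field are assumptions): it applies the formulas for $R$, $\Theta$, $Z$ to the velocity field of a unit-rate rotation about the $z$ axis, $(X, Y, Z) = (-y, x, 0)$.

```python
import math

def cylindrical_components(X, Y, Z, x, y):
    """Contravariant components (R, Theta, Z) at the point with Cartesian x, y.

    Follows Example 3.1: R = cos(t) X + sin(t) Y, Theta = (-sin(t) X + cos(t) Y) / r.
    """
    r = math.hypot(x, y)
    c, s = x / r, y / r  # cos(theta), sin(theta)
    R = c * X + s * Y
    Theta = (-s * X + c * Y) / r
    return R, Theta, Z

if __name__ == "__main__":
    x, y = 3.0, 4.0
    R, Theta, Zc = cylindrical_components(-y, x, 0.0, x, y)
    r = math.hypot(x, y)
    print(R, Theta, r * Theta)
```

For the rotation field the radial component vanishes and $\Theta$ is the angular rate, while the physical component $r\Theta$ is the speed, which has the same dimensions as $X$ and $Y$.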


Problems 


1. Show that, if the components of a contravariant vector vanish in one coordinate 
system, they vanish in all coordinate systems. 

2. If $A^i$ and $B^i$ are contravariant vector fields (the $A^i$ and $B^i$, $i = 1, 2, \ldots, n$, are really the components of the vector fields), show that $C^i = A^i + B^i$ is also a contravariant vector field.

3. What can be said of two contravariant vectors whose components are equal in one coordinate system?

4. If $X$, $Y$, $Z$ are the components given in Example 3.1, find the components in a spherical coordinate system. By what must $\Theta$ and $\Phi$ be multiplied to yield the physical components?

5. Let $A^i$ and $B^i$ be the components of two contravariant vector fields. Define $C^{ij}(x) = A^i(x)B^j(x)$, $\bar{C}^{ij}(\bar{x}) = \bar{A}^i(\bar{x})\bar{B}^j(\bar{x})$. Show that

$$\bar{C}^{ij}(\bar{x}) = \frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial\bar{x}^j}{\partial x^\beta} C^{\alpha\beta}(x)$$

6. Referring to Prob. 5, show that

$$C^{ij}(x) = \frac{\partial x^i}{\partial\bar{x}^\alpha}\frac{\partial x^j}{\partial\bar{x}^\beta}\bar{C}^{\alpha\beta}(\bar{x})$$


3.3. Covariant Vectors. We consider the scalar point function $\varphi(x^1, x^2, \ldots, x^n)$, which is assumed to be an absolute invariant in that under the allowable coordinate transformation $\bar{x}^i = \bar{x}^i(x^1, x^2, \ldots, x^n)$, $i = 1, 2, \ldots, n$,

$$\bar{\varphi}(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n) = \varphi(x^1, x^2, \ldots, x^n) = \varphi(x^1(\bar{x}), x^2(\bar{x}), \ldots, x^n(\bar{x})) \qquad (3.27)$$

where $x^i(\bar{x}) = x^i(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n)$, $i = 1, 2, \ldots, n$, is the inverse transformation of (3.26). We form the n-tuple

$$\left(\frac{\partial\varphi}{\partial x^1}, \frac{\partial\varphi}{\partial x^2}, \ldots, \frac{\partial\varphi}{\partial x^n}\right) \qquad (3.28)$$

which is an obvious generalization of the gradient of $\varphi$. Differentiating (3.27) yields

$$\frac{\partial\bar{\varphi}}{\partial\bar{x}^i} = \frac{\partial\varphi}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial\bar{x}^i} \qquad (3.29)$$

Equation (3.29) relates the components of grad $\varphi$ in the $x$ coordinate system with the components of grad $\bar{\varphi}$ ($\bar{\varphi} = \varphi$) in the $\bar{x}$ coordinate system.

More generally, the numbers $A_i(x^1, x^2, \ldots, x^n)$, $i = 1, 2, \ldots, n$, which transform according to the law

$$\bar{A}_i(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n) = \frac{\partial x^\alpha}{\partial\bar{x}^i} A_\alpha(x^1, x^2, \ldots, x^n) \qquad i = 1, 2, \ldots, n \qquad (3.30)$$

are said to be the components of a covariant vector field. The remarks of Sec. 3.2 concerning contravariant vectors apply here as well. One may ask what the difference is between a covariant and a contravariant vector. It is the law of transformation. Compare (3.25) with (3.30)! The reason that no such distinction was made in Chap. 2 will be answered in Sec. 3.7.

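3.4. Scalar Product of Two Vectors. Let $A^i(x)$ and $B_i(x)$ be the components of a contravariant vector and a covariant vector. We consider the sum $A^\alpha B_\alpha$. What is the form of $A^\alpha B_\alpha$ in the $\bar{x}$ coordinate system? The letter $\bar{x}$ is an abbreviation for the set $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n)$. Now

$$\bar{A}^\alpha = \frac{\partial\bar{x}^\alpha}{\partial x^\beta} A^\beta \qquad \bar{B}_\alpha = B_\gamma\frac{\partial x^\gamma}{\partial\bar{x}^\alpha}$$

so that

$$\bar{A}^\alpha\bar{B}_\alpha = A^\beta B_\gamma\frac{\partial\bar{x}^\alpha}{\partial x^\beta}\frac{\partial x^\gamma}{\partial\bar{x}^\alpha} = A^\beta B_\gamma\,\delta^\gamma_\beta = A^\beta B_\beta = A^\alpha B_\alpha$$

Hence the form of $A^\alpha B_\alpha$ remains invariant under a coordinate transformation. This scalar invariant is called the scalar, dot, or inner product of the two vectors $\mathbf{A}$ and $\mathbf{B}$.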
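The invariance of $A^\alpha B_\alpha$ is easy to check for a linear coordinate change $\bar{x}^i = M^i_\alpha x^\alpha$, for which $\partial\bar{x}^i/\partial x^\alpha = M^i_\alpha$ is constant. The sketch below is an illustration under that assumption; the matrix `M`, its inverse `M_INV`, and the function names are chosen here and do not come from the text.

```python
# Linear change of coordinates xbar^i = M[i][a] x^a in two dimensions.
M = [[2.0, 1.0],
     [1.0, 1.0]]        # d(xbar^i)/d(x^a)
M_INV = [[1.0, -1.0],
         [-1.0, 2.0]]   # d(x^a)/d(xbar^i), the inverse of M

def transform_contravariant(A):
    # Abar^i = (d xbar^i / d x^a) A^a, law (3.25)
    return [sum(M[i][a] * A[a] for a in range(2)) for i in range(2)]

def transform_covariant(B):
    # Bbar_i = (d x^a / d xbar^i) B_a, law (3.30)
    return [sum(M_INV[a][i] * B[a] for a in range(2)) for i in range(2)]

def contract(A, B):
    # The sum A^a B_a
    return sum(a * b for a, b in zip(A, B))

if __name__ == "__main__":
    A = [3.0, -1.0]   # contravariant components
    B = [2.0, 5.0]    # covariant components
    print(contract(A, B),
          contract(transform_contravariant(A), transform_covariant(B)))
```

The two printed numbers coincide, whereas contracting two contravariant (or two covariant) sets of components would not in general give the same value in both systems.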


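Problems

1. If $\bar{A}_i = A_\alpha\frac{\partial x^\alpha}{\partial\bar{x}^i}$, show that $A_i = \bar{A}_\alpha\frac{\partial\bar{x}^\alpha}{\partial x^i}$.

2. If $\varphi$ and $\psi$ are scalar invariants, show that

$$\operatorname{grad}(\varphi\psi) = \varphi\operatorname{grad}\psi + \psi\operatorname{grad}\varphi$$

$$\operatorname{grad}F(\varphi) = F'(\varphi)\operatorname{grad}\varphi$$

3. If $A_i$ and $B_i$ are the components of two covariant vector fields, show that $C_{ij} = A_i B_j$ transforms according to the law

$$\bar{C}_{ij}(\bar{x}) = C_{\alpha\beta}(x)\frac{\partial x^\alpha}{\partial\bar{x}^i}\frac{\partial x^\beta}{\partial\bar{x}^j}$$

4. Show that $C^i_j = A^i B_j$ transforms according to the law

$$\bar{C}^i_j(\bar{x}) = C^\alpha_\beta(x)\frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial x^\beta}{\partial\bar{x}^j}$$

where $A^i$ and $B_i$ are the components of a contravariant vector and covariant vector, respectively.

5. Assume $A^\alpha(x)B_\alpha(x) = \bar{A}^\alpha(\bar{x})\bar{B}_\alpha(\bar{x})$ for all contravariant vector fields $A^i$. Show that $B_i$ is a covariant vector field.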


ox 
6. Let 3, = 8a Ee Show that 


ita _ dg) dz a8 

Och = aa®) AF AF! 
3.5. Tensors. The contravariant and covariant vectors defined above are special cases of differential invariants called tensors. The components of the tensor are of the form

$$T^{a_1 a_2\cdots a_r}_{b_1 b_2\cdots b_s}(x^1, x^2, \ldots, x^n) \qquad (3.31)$$

where the indices $a_1, a_2, \ldots, a_r$, $b_1, b_2, \ldots, b_s$ run through the values $1, 2, \ldots, n$ and the components transform according to the rule

$$\bar{T}^{a_1\cdots a_r}_{b_1\cdots b_s}(\bar{x}) = \left|\frac{\partial x}{\partial\bar{x}}\right|^N T^{\alpha_1\cdots\alpha_r}_{\beta_1\cdots\beta_s}(x)\,\frac{\partial\bar{x}^{a_1}}{\partial x^{\alpha_1}}\cdots\frac{\partial\bar{x}^{a_r}}{\partial x^{\alpha_r}}\,\frac{\partial x^{\beta_1}}{\partial\bar{x}^{b_1}}\cdots\frac{\partial x^{\beta_s}}{\partial\bar{x}^{b_s}} \qquad (3.32)$$

The exponent $N$ of the Jacobian $\left|\frac{\partial x}{\partial\bar{x}}\right|$ is called the weight of the tensor.

If $N = 0$, we say that the tensor field is absolute; otherwise the tensor field is relative of weight $N$. For $N = 1$ we have a tensor density. The tensor of (3.32) is said to be contravariant of order $r$ and covariant of order $s$. If $s = 0$ (no subscripts), the tensor is purely contravariant, and if $r = 0$ (no superscripts), the tensor is purely covariant; otherwise we have a mixed tensor. The vectors of Secs. 3.2 and 3.3 are absolute tensors. If no indices occur, we are speaking of a scalar.

At times we shall call $T^{\alpha_1\cdots\alpha_r}_{\beta_1\cdots\beta_s}(x)$ a tensor, although strictly speaking the various $T$'s are the components of the tensor in the $x$ coordinate system.

Two tensors are said to be of the same kind if the tensors have the 
same number of covariant indices, the same number of contravariant 
indices, and the same weight. Let the reader show that the sum of two 
tensors of the same kind is again a tensor of the same kind. 

We can construct further tensors as follows: 

(a) The sum and difference of two tensors of the same kind are again 
a tensor of the same kind. 

(b) The product of two tensors is a tensor. We show this for a special 
case. Let 

















. = On |? en. ax? Ozx* 
OF 8 ak) Ax 
= ox |8 Oz* oF! 
eb ce. | 
S ot Sv Ox? Ax7 
Mee da? 0x* az* dx! 
~Qkly — 8 oT 








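(c) Contraction. Let us consider the absolute mixed tensor $A^i_j$. We have

$$\bar{A}^i_j(\bar{x}) = A^\alpha_\beta(x)\frac{\partial x^\beta}{\partial\bar{x}^j}\frac{\partial\bar{x}^i}{\partial x^\alpha} \qquad (3.33)$$

The elements of $A^i_j$ may be looked upon as the elements of a matrix. Let us assume that we are interested in the sum of the diagonal terms, $A^i_i$. In (3.33) let $j = i$, and sum. We obtain

$$\bar{A}^i_i(\bar{x}) = A^\alpha_\beta(x)\frac{\partial x^\beta}{\partial\bar{x}^i}\frac{\partial\bar{x}^i}{\partial x^\alpha} = A^\alpha_\beta\frac{\partial x^\beta}{\partial x^\alpha} = A^\alpha_\beta\,\delta^\beta_\alpha = A^\alpha_\alpha(x)$$

Hence $A^i_i(x)$ is a scalar invariant (invariant in both form and numerical value). From the mixed tensor we obtained a scalar by equating a superscript to a subscript and summing. Let us extend this result to the absolute tensor $A^{ij}_{klm}$. We have

$$\bar{A}^{ij}_{klm} = A^{\alpha\beta}_{\sigma\mu\nu}\frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial\bar{x}^j}{\partial x^\beta}\frac{\partial x^\sigma}{\partial\bar{x}^k}\frac{\partial x^\mu}{\partial\bar{x}^l}\frac{\partial x^\nu}{\partial\bar{x}^m}$$

Now let $k = j$, and sum. We obtain

$$\bar{A}^{ij}_{jlm} = A^{\alpha\beta}_{\sigma\mu\nu}\frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial\bar{x}^j}{\partial x^\beta}\frac{\partial x^\sigma}{\partial\bar{x}^j}\frac{\partial x^\mu}{\partial\bar{x}^l}\frac{\partial x^\nu}{\partial\bar{x}^m} = A^{\alpha\beta}_{\sigma\mu\nu}\,\delta^\sigma_\beta\,\frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial x^\mu}{\partial\bar{x}^l}\frac{\partial x^\nu}{\partial\bar{x}^m} = A^{\alpha\sigma}_{\sigma\mu\nu}\frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial x^\mu}{\partial\bar{x}^l}\frac{\partial x^\nu}{\partial\bar{x}^m}$$

so that $A^{\alpha\sigma}_{\sigma\mu\nu}$ are the elements of an absolute mixed tensor. We may write $B^\alpha_{\mu\nu} = A^{\alpha\sigma}_{\sigma\mu\nu}$ since the index $\sigma$ is summed from 1 to $n$ and hence disappears. Notice that the contravariant and covariant orders of $B^\alpha_{\mu\nu}$ are one less than those of $A^{ij}_{klm}$. It is not difficult to see why this method of producing a new tensor from a given one is called the method of contraction.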

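(d) Quotient Law. One may wish to show that the elements $B_{jk}(x)$ are the elements of a tensor. Assume that it is known that $A^i B_{jk}$ is a tensor for arbitrary contravariant vectors $A^i$. Then

$$\bar{A}^i\bar{B}_{jk} = A^\alpha B_{\beta\gamma}\frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial x^\beta}{\partial\bar{x}^j}\frac{\partial x^\gamma}{\partial\bar{x}^k} = \bar{A}^i B_{\beta\gamma}\frac{\partial x^\beta}{\partial\bar{x}^j}\frac{\partial x^\gamma}{\partial\bar{x}^k}$$

so that $\bar{A}^i\left(\bar{B}_{jk} - B_{\beta\gamma}\frac{\partial x^\beta}{\partial\bar{x}^j}\frac{\partial x^\gamma}{\partial\bar{x}^k}\right) = 0$. Since $\bar{A}^i$ is arbitrary, we must have

$$\bar{B}_{jk} = B_{\beta\gamma}\frac{\partial x^\beta}{\partial\bar{x}^j}\frac{\partial x^\gamma}{\partial\bar{x}^k}$$

This is the desired result. If $A^i B^{\cdots}_{\cdots}$ is a tensor for arbitrary $A^i$, then $B^{\cdots}_{\cdots}$ is a tensor. This is the quotient law.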


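Example 3.2. The Kronecker delta $\delta^i_j$ (see Sec. 1.1) is a mixed absolute tensor, for

$$\delta^i_j = \frac{\partial\bar{x}^i}{\partial\bar{x}^j} = \frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial\bar{x}^j} = \frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial x^\beta}{\partial\bar{x}^j}\,\delta^\alpha_\beta$$

If $\varphi(x^1, x^2, \ldots, x^n) = \varphi(x)$ is an absolute scalar, $\bar{\varphi}(\bar{x}) = \varphi(x)$, then $\varphi\delta^i_j$ is an absolute mixed tensor whose components in the $x$ coordinate system equal the corresponding components in the $\bar{x}$ coordinate system. Conversely, let $A^i_j(x)$ be an isotropic tensor, that is, $\bar{A}^i_j(\bar{x}(x)) = A^i_j(x)$. Then

$$A^i_j = A^\alpha_\beta\frac{\partial\bar{x}^i}{\partial x^\alpha}\frac{\partial x^\beta}{\partial\bar{x}^j}$$

$$A^i_j\frac{\partial\bar{x}^j}{\partial x^\sigma} = A^\alpha_\sigma\frac{\partial\bar{x}^i}{\partial x^\alpha}$$

$$\frac{\partial\bar{x}^j}{\partial x^\alpha}\left(\delta^\alpha_\sigma A^i_j - A^\alpha_\sigma\delta^i_j\right) = 0$$

Since $\frac{\partial\bar{x}^j}{\partial x^\alpha}$ can be chosen arbitrarily, it follows that

$$\delta^\alpha_\sigma A^i_j = A^\alpha_\sigma\delta^i_j$$

for all $i$, $j$, $\alpha$, $\sigma$. Choosing $i \neq j$ and $\alpha = \sigma$, it follows that $A^i_j = 0$, $i \neq j$. Now choose $i = j$, $\alpha = \sigma$, no summation intended. Then $A^i_i = A^\alpha_\alpha$, no summation occurring. Hence the diagonal elements of $A^i_j$ are equal to each other, so that $A^i_j = \varphi\delta^i_j$. Let the reader show that if $A^{ij}_{kl}$ is isotropic then

$$A^{ij}_{kl} = \varphi(x)\,\delta^i_k\delta^j_l + \psi(x)\,\delta^i_l\delta^j_k \qquad (3.34)$$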


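Example 3.3. Let $g_{ij}(x)$ be the components of an absolute covariant tensor so that

$$\bar{g}_{ij}(\bar{x}) = g_{\alpha\beta}(x)\frac{\partial x^\alpha}{\partial\bar{x}^i}\frac{\partial x^\beta}{\partial\bar{x}^j}$$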

_ OF) ox 

and Ys) Doe or’ = Jao OF 


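If $g_{\alpha\beta} = g_{\beta\alpha}$, we have, upon taking determinants, that

$$|\bar{g}_{ij}| = |g_{ij}|\left|\frac{\partial x}{\partial\bar{x}}\right|^2 \qquad\text{and}\qquad |\bar{g}_{ij}|^{1/2} = |g_{ij}|^{1/2}\left|\frac{\partial x}{\partial\bar{x}}\right|$$

Now if $A^i$ are the components of an absolute contravariant vector, then $\bar{A}^i = A^\alpha\frac{\partial\bar{x}^i}{\partial x^\alpha}$ and

$$\bar{B}^i = |\bar{g}_{ij}|^{1/2}\bar{A}^i = \left|\frac{\partial x}{\partial\bar{x}}\right| |g_{ij}|^{1/2} A^\alpha\frac{\partial\bar{x}^i}{\partial x^\alpha} = \left|\frac{\partial x}{\partial\bar{x}}\right| B^\alpha\frac{\partial\bar{x}^i}{\partial x^\alpha}$$

Thus $B^i = |g_{ij}|^{1/2} A^i$ is a vector density. This method affords a means of changing absolute tensors into relative tensors.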
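Example 3.4. Assume $g_{\alpha\beta}\,dx^\alpha\,dx^\beta$ is an absolute scalar invariant, that is,

$$\bar{g}_{\alpha\beta}\,d\bar{x}^\alpha\,d\bar{x}^\beta = g_{\alpha\beta}\,dx^\alpha\,dx^\beta$$

Moreover we assume that $g_{\alpha\beta} = g_{\beta\alpha}$. From $d\bar{x}^\alpha = \frac{\partial\bar{x}^\alpha}{\partial x^\mu}\,dx^\mu$ we have

$$\bar{g}_{\alpha\beta}\frac{\partial\bar{x}^\alpha}{\partial x^\mu}\frac{\partial\bar{x}^\beta}{\partial x^\nu}\,dx^\mu\,dx^\nu = g_{\mu\nu}\,dx^\mu\,dx^\nu$$

or

$$\left(\bar{g}_{\alpha\beta}\frac{\partial\bar{x}^\alpha}{\partial x^\mu}\frac{\partial\bar{x}^\beta}{\partial x^\nu} - g_{\mu\nu}\right)dx^\mu\,dx^\nu = 0$$

for arbitrary $dx^\mu$. It follows from Example 1.2 that

$$g_{\mu\nu} = \bar{g}_{\alpha\beta}\frac{\partial\bar{x}^\alpha}{\partial x^\mu}\frac{\partial\bar{x}^\beta}{\partial x^\nu}$$

Hence the $g_{\alpha\beta}(x)$ are the components of an absolute covariant tensor of rank 2 (two subscripts).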
Example 3.5. Let $A_i$ and $B_i$ be absolute covariant vectors. Let the reader show that

$$C_{ij} = A_i B_j - A_j B_i$$

is an absolute covariant tensor of rank 2 and that $C_{ij} = -C_{ji}$. For a three-dimensional space

$$\|C_{ij}\| = \begin{Vmatrix} 0 & A_1B_2 - A_2B_1 & A_1B_3 - A_3B_1 \\ -(A_1B_2 - A_2B_1) & 0 & A_2B_3 - A_3B_2 \\ -(A_1B_3 - A_3B_1) & -(A_2B_3 - A_3B_2) & 0 \end{Vmatrix}$$

The nonvanishing terms correspond to the components of the vector product.
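The correspondence with the vector product can be seen on sample numbers. The following sketch is illustrative only (zero-based indices; the function name is an assumption): it builds the array $C_{ij} = A_iB_j - A_jB_i$ and reads off the three independent entries.

```python
def antisymmetric_product(A, B):
    # C_ij = A_i B_j - A_j B_i for i, j = 0, 1, 2
    return [[A[i] * B[j] - A[j] * B[i] for j in range(3)] for i in range(3)]

if __name__ == "__main__":
    A, B = [1, 2, 3], [4, 5, 6]
    C = antisymmetric_product(A, B)
    # Components of the vector product A x B: (C_23, C_31, C_12)
    print(C[1][2], C[2][0], C[0][1])
```

For these triples the three entries are $-3$, $6$, $-3$, the components of $\mathbf{A}\times\mathbf{B}$, while the diagonal entries vanish and $C_{ij} = -C_{ji}$.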


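Problems

1. If the components of a tensor are zero in one coordinate system, show that the components are zero in all coordinate systems.

2. If $g_{\alpha\beta} = g_{\beta\alpha}$ and $\bar{g}_{ij} = g_{\alpha\beta}\frac{\partial x^\alpha}{\partial\bar{x}^i}\frac{\partial x^\beta}{\partial\bar{x}^j}$, show that $\bar{g}_{ij} = \bar{g}_{ji}$.

3. From $g_{\alpha\beta} = \frac{1}{2}(g_{\alpha\beta} + g_{\beta\alpha}) + \frac{1}{2}(g_{\alpha\beta} - g_{\beta\alpha})$, show that

$$g_{\alpha\beta}\,dx^\alpha\,dx^\beta = \frac{1}{2}(g_{\alpha\beta} + g_{\beta\alpha})\,dx^\alpha\,dx^\beta$$

4. If $A^{ij}_{kl}$ is an absolute tensor, show that $A^{ij}_{ij}$ is an absolute scalar.

5. If $A_{\alpha\beta}$ is an absolute tensor, and if $A^{\alpha\beta}A_{\beta\gamma} = \delta^\alpha_\gamma$, show that $A^{\alpha\beta}$ is an absolute tensor. The two tensors are said to be reciprocal.

6. Show that the cofactors of the determinant $|a_{ij}|$ are the components of a relative tensor of weight 2 if $a_{ij}$ is an absolute covariant tensor.

7. If $A^i$ is an absolute contravariant vector, show that $\frac{\partial A^i}{\partial x^j}$ is not a mixed tensor.

8. Assume $A^{ij}_{kl}$ is an isotropic tensor. Show that (3.34) holds.

9. Use (1.26) and $|\bar{g}_{ij}| = |g_{ij}|\left|\frac{\partial x}{\partial\bar{x}}\right|^2$ to show that $|g_{ij}|^{1/2}\epsilon_{i_1 i_2\cdots i_n}$ is an absolute tensor.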





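3.6. The Line Element. For the Euclidean space of three dimensions we have

$$ds^2 = dx^2 + dy^2 + dz^2 \qquad (3.35)$$

The simple generalization to a Euclidean n-space yields

$$ds^2 = (dx^1)^2 + (dx^2)^2 + \cdots + (dx^n)^2 = \delta_{\alpha\beta}\,dx^\alpha\,dx^\beta \qquad \delta_{\alpha\beta} = 0 \text{ if } \alpha \neq \beta,\ \delta_{\alpha\beta} = 1 \text{ if } \alpha = \beta \qquad (3.36)$$

If we apply a coordinate transformation, $x^i = x^i(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n)$, $i = 1, 2, \ldots, n$, we have $dx^\alpha = \frac{\partial x^\alpha}{\partial\bar{x}^\mu}\,d\bar{x}^\mu$, $dx^\beta = \frac{\partial x^\beta}{\partial\bar{x}^\nu}\,d\bar{x}^\nu$, so that (3.36) takes the form

$$ds^2 = \delta_{\alpha\beta}\frac{\partial x^\alpha}{\partial\bar{x}^\mu}\frac{\partial x^\beta}{\partial\bar{x}^\nu}\,d\bar{x}^\mu\,d\bar{x}^\nu = \bar{g}_{\mu\nu}\,d\bar{x}^\mu\,d\bar{x}^\nu \qquad (3.37)$$

where $\bar{g}_{\mu\nu} = \delta_{\alpha\beta}\frac{\partial x^\alpha}{\partial\bar{x}^\mu}\frac{\partial x^\beta}{\partial\bar{x}^\nu} = \sum_{\alpha=1}^{n}\frac{\partial x^\alpha}{\partial\bar{x}^\mu}\frac{\partial x^\alpha}{\partial\bar{x}^\nu}$. Riemann considered the general quadratic form

$$ds^2 = g_{\alpha\beta}(x)\,dx^\alpha\,dx^\beta \qquad (3.38)$$

This quadratic form (the line element $ds^2$) is called a Riemannian metric. The $g_{\alpha\beta}(x)$ are the components of the metric tensor (see Example 3.4). A space characterized by the metric (3.38) is called a Riemannian space.

Theorems regarding this Riemannian space yield a Riemannian geometry. Given the form (3.38), it does not follow that a coordinate transformation exists such that $ds^2 = \delta_{\alpha\beta}\,dy^\alpha\,dy^\beta$. If there is a coordinate transformation $x^i = x^i(y^1, y^2, \ldots, y^n)$, such that $ds^2 = \delta_{\alpha\beta}\,dy^\alpha\,dy^\beta$, we say that the Riemannian space is Euclidean. The $y$ coordinate system is said to be a Euclidean coordinate system. Any coordinate system for which the $g_{ij}$ are constants is called a cartesian coordinate system (after Descartes).

We can choose the metric tensor symmetric, for

$$g_{ij} = \frac{1}{2}(g_{ij} + g_{ji}) + \frac{1}{2}(g_{ij} - g_{ji})$$

and $\frac{1}{2}(g_{ij} - g_{ji})\,dx^i\,dx^j = 0$. The terms $\frac{1}{2}(g_{ij} + g_{ji})$ are symmetric. We assume that the quadratic form is positive-definite (see Sec. 1.6).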


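Example 3.6. In a three-dimensional Euclidean space using Euclidean coordinates one has

$$ds^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2$$

so that

$$\|g_{ij}\| = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$

For spherical coordinates

$$x^1 = r\sin\theta\cos\varphi = y^1 \sin y^2 \cos y^3$$
$$x^2 = r\sin\theta\sin\varphi = y^1 \sin y^2 \sin y^3$$
$$x^3 = r\cos\theta = y^1 \cos y^2$$

The $\bar g_{ij}$ of the spherical coordinate system can be obtained from

$$\bar g_{ij}(y^1, y^2, y^3) = g_{\alpha\beta}(x(y))\,\frac{\partial x^\alpha}{\partial y^i}\frac{\partial x^\beta}{\partial y^j}$$

Hence

$$\bar g_{11} = \left(\frac{\partial x^1}{\partial y^1}\right)^2 + \left(\frac{\partial x^2}{\partial y^1}\right)^2 + \left(\frac{\partial x^3}{\partial y^1}\right)^2 = 1$$
$$\bar g_{22} = (y^1)^2$$
$$\bar g_{33} = (y^1)^2(\sin y^2)^2$$
$$\bar g_{ij} = 0 \quad \text{for } i \neq j$$

We obtain

$$ds^2 = (dy^1)^2 + (y^1)^2(dy^2)^2 + (y^1 \sin y^2)^2(dy^3)^2 = dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2$$

This spherical coordinate system is not cartesian since $\bar g_{22} \neq$ constant.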
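The transformation rule used in Example 3.6 can be checked numerically. The following is a minimal sketch, not part of the original text: it estimates the Jacobian $\partial x^\alpha/\partial y^i$ by central differences and forms $\bar g_{ij} = \delta_{\alpha\beta}\,(\partial x^\alpha/\partial y^i)(\partial x^\beta/\partial y^j)$. The function names, the step size, and the sample point are illustrative assumptions.

```python
import math

def spherical_to_cartesian(y):
    """Map (y1, y2, y3) = (r, theta, phi) to Euclidean (x1, x2, x3)."""
    r, theta, phi = y
    return [r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta)]

def jacobian(f, y, h=1e-6):
    """Central-difference estimate of J[a][i] = d x^a / d y^i."""
    n = len(y)
    cols = []
    for i in range(n):
        yp, ym = list(y), list(y)
        yp[i] += h
        ym[i] -= h
        fp, fm = f(yp), f(ym)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # cols[i][a] holds d x^a / d y^i; transpose to J[a][i]
    return [[cols[i][a] for i in range(n)] for a in range(len(cols[0]))]

def induced_metric(f, y):
    """g_bar_ij = sum_a (dx^a/dy^i)(dx^a/dy^j) for a Euclidean x system."""
    J = jacobian(f, y)
    n = len(y)
    return [[sum(J[a][i] * J[a][j] for a in range(len(J))) for j in range(n)]
            for i in range(n)]

point = [2.0, 0.7, 1.1]  # sample (r, theta, phi), chosen arbitrarily
g = induced_metric(spherical_to_cartesian, point)
# Diagonal entries predicted by Example 3.6: 1, r^2, r^2 sin^2(theta)
expected = [1.0, point[0] ** 2, (point[0] * math.sin(point[1])) ** 2]
```

Comparing the diagonal of `g` with `expected`, and the off-diagonal entries with zero, reproduces the line element of the example to within finite-difference error.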

Example 3.7. We define $g^{ij}$ as the reciprocal tensor to $g_{ij}$, that is, $g^{ij}g_{jk} = \delta^i_k$ (see Prob. 5, Sec. 3.5). The $g^{ij}$ are the signed minors of the $g_{ij}$ divided by the determinant of the $g_{ij}$. The $g^{ij}$ are the elements of the inverse matrix of the matrix $\|g_{ij}\|$. For the spherical coordinates of Example 3.6 we have

$$\|g_{ij}\| = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{Vmatrix} \qquad \|g^{ij}\| = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & \dfrac{1}{r^2} & 0 \\ 0 & 0 & \dfrac{1}{r^2\sin^2\theta} \end{Vmatrix}$$


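Example 3.8. We define the length L of a vector $A^i$ in a Riemannian space by the quadratic form

$$L^2 = g_{\alpha\beta}A^\alpha A^\beta \tag{3.39}$$

The associated vector of $A^i$ is the covariant vector

$$A_i = g_{i\alpha}A^\alpha$$

It is easily seen that $A^i = g^{i\beta}A_\beta$, so that

$$L^2 = g_{\alpha\beta}A^\alpha A^\beta = g^{\alpha\beta}A_\alpha A_\beta \tag{3.40}$$

The cosine of the angle between two vectors $A^i$, $B^i$ is defined by

$$\cos\theta = \frac{g_{ij}A^i B^j}{(g_{\alpha\beta}A^\alpha A^\beta)^{1/2}\,(g_{\mu\nu}B^\mu B^\nu)^{1/2}} \tag{3.41}$$

Let the reader show that $|\cos\theta| \leq 1$.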
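As an illustration of (3.39) to (3.41), the sketch below evaluates the length and angle formulas for the diagonal spherical metric of Example 3.6. It is only a numerical spot check under assumed sample values; the helper names and the two vectors are hypothetical and do not come from the text.

```python
import math

def spherical_metric(r, theta):
    """Diagonal entries g_11, g_22, g_33 of the spherical metric."""
    return [1.0, r ** 2, (r * math.sin(theta)) ** 2]

def lower(g_diag, A):
    """Associated covariant vector A_i = g_{i alpha} A^alpha (diagonal metric)."""
    return [g * a for g, a in zip(g_diag, A)]

def length_sq_contra(g_diag, A):
    """L^2 = g_{alpha beta} A^alpha A^beta, formula (3.39)."""
    return sum(g * a * a for g, a in zip(g_diag, A))

def length_sq_cov(g_diag, A_low):
    """L^2 = g^{alpha beta} A_alpha A_beta, right side of (3.40)."""
    return sum(a * a / g for g, a in zip(g_diag, A_low))

def cos_angle(g_diag, A, B):
    """Cosine of the angle between A and B, formula (3.41)."""
    dot = sum(g * a * b for g, a, b in zip(g_diag, A, B))
    return dot / math.sqrt(length_sq_contra(g_diag, A) * length_sq_contra(g_diag, B))

g_diag = spherical_metric(2.0, 0.5)       # sample point r = 2, theta = 0.5
A = [1.0, 0.3, -0.2]                      # sample contravariant components
B = [0.5, -0.1, 0.4]
L2_contra = length_sq_contra(g_diag, A)
L2_cov = length_sq_cov(g_diag, lower(g_diag, A))
c = cos_angle(g_diag, A, B)
```

The two values of $L^2$ agree, as (3.40) requires, and `c` lies in $[-1, 1]$ for this sample; a spot check of this kind does not replace the proofs asked for in the problems.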


Problems 


1. Prove (3.40). 
2. Show that $|\cos\theta| \leq 1$, $\cos\theta$ defined by (3.41).
3. For paraboloidal coordinates

$$x^1 = y^1 y^2 \cos y^3$$
$$x^2 = y^1 y^2 \sin y^3$$
$$x^3 = \tfrac12\left[(y^1)^2 - (y^2)^2\right]$$

show that

$$ds^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2 = \left[(y^1)^2 + (y^2)^2\right]\left[(dy^1)^2 + (dy^2)^2\right] + (y^1 y^2)^2 (dy^3)^2$$

4. Consider the hypersurface $x^i = x^i(u^1, u^2)$ embedded in a Riemannian 3-space. If we keep $u^1$ fixed, $u^1 = u^1_0$, we obtain the space curve $x^i = x^i(u^1_0, u^2)$, called the $u^2$ curve. Similarly $x^i = x^i(u^1, u^2_0)$ represents a $u^1$ curve on this surface. These curves are called the coordinate curves of the surface. Show that the metric for the surface is $ds^2 = h_{ij}\,du^i du^j$, where $h_{ij} = g_{\alpha\beta}\dfrac{\partial x^\alpha}{\partial u^i}\dfrac{\partial x^\beta}{\partial u^j}$. Show that the coordinate curves intersect orthogonally if $h_{12} = 0$.
5. The equation $\varphi(x^1, x^2, \ldots, x^n) = 0$ determines a hypersurface of a $V_n$. Show that the $\dfrac{\partial\varphi}{\partial x^i}$ are the components of a covariant vector normal to the surface.
6. Show that $\dfrac{dx^i}{ds}$ is a unit tangent vector to the space curve $x^i(s)$, $ds^2 = g_{\alpha\beta}\,dx^\alpha dx^\beta$.
7. Show that the unit vectors tangent to the $u^1$ and $u^2$ curves of Prob. 4 are given by $h_{11}^{-1/2}\dfrac{\partial x^i}{\partial u^1}$, $h_{22}^{-1/2}\dfrac{\partial x^i}{\partial u^2}$.
8. If $\theta$ is the angle between the coordinate curves of Prob. 4, show that $\cos\theta = (h_{11}h_{22})^{-1/2}h_{12}$.
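3.7. Geodesics in a Riemannian Space. If a space curve in a Riemannian space is given by $x^i = x^i(t)$, we can compute the distance between two points of the curve by the formula

$$s = \int_{t_0}^{t_1}\left(g_{\alpha\beta}(x(t))\,\frac{dx^\alpha}{dt}\frac{dx^\beta}{dt}\right)^{1/2} dt \tag{3.42}$$

The geodesic is defined as that particular curve $x^i(t)$ joining $x^i(t_0)$ and $x^i(t_1)$ which extremalizes (3.42). The problem of determining the geodesic reduces to a problem in the calculus of variations. We apply the Euler-Lagrange formula to (3.42) (see Sec. 7.6). The differential equation of Euler-Lagrange is

$$\frac{d}{dt}\left(\frac{\partial f}{\partial \dot x^i}\right) - \frac{\partial f}{\partial x^i} = 0 \tag{3.43}$$

where $f = (g_{\alpha\beta}\dot x^\alpha \dot x^\beta)^{1/2}$. Now

$$\frac{\partial f}{\partial x^i} = \frac{1}{2f}\frac{\partial g_{\alpha\beta}}{\partial x^i}\dot x^\alpha \dot x^\beta$$
$$\frac{\partial f}{\partial \dot x^i} = \frac{1}{2f}\left(g_{\alpha i}\dot x^\alpha + g_{i\beta}\dot x^\beta\right)$$

If instead of using t as a parameter we switch to arc length s, then t = s and f = 1. Hence

$$\frac{\partial f}{\partial x^i} = \frac12\frac{\partial g_{\alpha\beta}}{\partial x^i}\dot x^\alpha \dot x^\beta$$
$$\frac{d}{dt}\left(\frac{\partial f}{\partial \dot x^i}\right) = \frac12\left(g_{\alpha i}\ddot x^\alpha + g_{i\beta}\ddot x^\beta\right) + \frac12\left(\frac{\partial g_{\alpha i}}{\partial x^\beta} + \frac{\partial g_{i\beta}}{\partial x^\alpha}\right)\dot x^\alpha \dot x^\beta$$

so that (using the fact that $g_{ij} = g_{ji}$) (3.43) becomes

$$g_{\alpha i}\ddot x^\alpha + \frac12\left(\frac{\partial g_{\alpha i}}{\partial x^\beta} + \frac{\partial g_{i\beta}}{\partial x^\alpha} - \frac{\partial g_{\alpha\beta}}{\partial x^i}\right)\dot x^\alpha \dot x^\beta = 0$$

Multiplying by $g^{ri}$ and summing on i yields

$$\frac{d^2x^r}{ds^2} + \frac12 g^{ri}\left(\frac{\partial g_{\alpha i}}{\partial x^\beta} + \frac{\partial g_{i\beta}}{\partial x^\alpha} - \frac{\partial g_{\alpha\beta}}{\partial x^i}\right)\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0$$

or

$$\frac{d^2x^r}{ds^2} + \Gamma^r_{\alpha\beta}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0 \qquad r = 1, 2, \ldots, n \tag{3.44}$$

where

$$\Gamma^r_{\alpha\beta} = \frac12 g^{ri}\left(\frac{\partial g_{\alpha i}}{\partial x^\beta} + \frac{\partial g_{i\beta}}{\partial x^\alpha} - \frac{\partial g_{\alpha\beta}}{\partial x^i}\right) \tag{3.44'}$$

Equations (3.44) are the second-order differential equations of the geodesics or paths. The functions $\Gamma^i_{jk}$ are called the Christoffel symbols of the second kind.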


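Example 3.9. For a Euclidean space using cartesian coordinates we have $g_{ij}$ = constant so that $\dfrac{\partial g_{ij}}{\partial x^k} = 0$ and (3.44') yields $\Gamma^i_{jk} = 0$. Hence the geodesics are given by $\dfrac{d^2x^i}{ds^2} = 0$, or $x^i = a^i s + b^i$, linear paths.

Example 3.10. Assume that we live on the surface of a right helicoid immersed in a Euclidean 3-space, $ds^2 = (dx^1)^2 + [(x^1)^2 + c^2](dx^2)^2$. We have

$$\|g_{ij}\| = \begin{Vmatrix} 1 & 0 \\ 0 & (x^1)^2 + c^2 \end{Vmatrix} \qquad \|g^{ij}\| = \begin{Vmatrix} 1 & 0 \\ 0 & \dfrac{1}{(x^1)^2 + c^2} \end{Vmatrix}$$

Applying (3.44'), we have

$$\Gamma^2_{12} = \frac12 g^{2i}\left(\frac{\partial g_{1i}}{\partial x^2} + \frac{\partial g_{i2}}{\partial x^1} - \frac{\partial g_{12}}{\partial x^i}\right)$$

Since $g^{2i} = 0$ unless i = 2, we have

$$\Gamma^2_{12} = \frac12 g^{22}\left(\frac{\partial g_{12}}{\partial x^2} + \frac{\partial g_{22}}{\partial x^1} - \frac{\partial g_{12}}{\partial x^2}\right) = \frac{1}{2[(x^1)^2 + c^2]}\,2x^1 = \frac{x^1}{(x^1)^2 + c^2}$$

Similarly,

$$\Gamma^1_{11} = 0,\quad \Gamma^2_{11} = 0,\quad \Gamma^1_{12} = \Gamma^1_{21} = 0,\quad \Gamma^1_{22} = -x^1,\quad \Gamma^2_{22} = 0,\quad \Gamma^2_{12} = \Gamma^2_{21} = \frac{x^1}{(x^1)^2 + c^2}$$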


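The differential equations of the geodesics are

$$\frac{d^2x^1}{ds^2} + \Gamma^1_{11}\left(\frac{dx^1}{ds}\right)^2 + 2\Gamma^1_{12}\frac{dx^1}{ds}\frac{dx^2}{ds} + \Gamma^1_{22}\left(\frac{dx^2}{ds}\right)^2 = 0$$
$$\frac{d^2x^2}{ds^2} + \Gamma^2_{11}\left(\frac{dx^1}{ds}\right)^2 + 2\Gamma^2_{12}\frac{dx^1}{ds}\frac{dx^2}{ds} + \Gamma^2_{22}\left(\frac{dx^2}{ds}\right)^2 = 0$$

or

$$\frac{d^2x^1}{ds^2} - x^1\left(\frac{dx^2}{ds}\right)^2 = 0$$
$$\frac{d^2x^2}{ds^2} + \frac{2x^1}{(x^1)^2 + c^2}\frac{dx^1}{ds}\frac{dx^2}{ds} = 0 \tag{3.44''}$$

Integrating the second of these equations yields

$$\frac{dx^2}{ds}\left[(x^1)^2 + c^2\right] = \text{constant} = A$$

Let the reader show that $\left(\dfrac{dx^1}{ds}\right)^2 + \dfrac{A^2}{(x^1)^2 + c^2} = \text{constant}$.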
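The Christoffel symbols of Example 3.10 can be cross-checked by evaluating the definition (3.44') directly. The sketch below is an illustration only: it differentiates the helicoid metric by central differences, and the value c = 1, the sample point, and all function names are assumptions made for the check.

```python
def metric(x, c=1.0):
    """Helicoid metric of Example 3.10: g_11 = 1, g_22 = (x1)^2 + c^2."""
    return [[1.0, 0.0], [0.0, x[0] ** 2 + c ** 2]]

def inverse_2x2(g):
    """Inverse of a 2 x 2 matrix, giving the reciprocal tensor g^{ij}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(g_func, x, k, h=1e-5):
    """Central-difference estimate of d g_ij / d x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = g_func(xp), g_func(xm)
    n = len(x)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)] for i in range(n)]

def christoffel(g_func, x):
    """Gamma[r][a][b] = 1/2 g^{ri} (d_b g_{ai} + d_a g_{ib} - d_i g_{ab})."""
    n = len(x)
    ginv = inverse_2x2(g_func(x))
    dg = [metric_derivative(g_func, x, k) for k in range(n)]  # dg[k][i][j]
    return [[[0.5 * sum(ginv[r][i] * (dg[b][a][i] + dg[a][i][b] - dg[i][a][b])
                        for i in range(n))
              for b in range(n)] for a in range(n)] for r in range(n)]

x = [1.5, 0.3]                 # sample point (x1, x2)
G = christoffel(metric, x)
gamma_1_22 = G[0][1][1]        # Example 3.10 gives -x1
gamma_2_12 = G[1][0][1]        # Example 3.10 gives x1 / ((x1)^2 + c^2)
```

With zero-based list indices, `G[0][1][1]` corresponds to $\Gamma^1_{22}$ and `G[1][0][1]` to $\Gamma^2_{12}$.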


Problems 


1. Derive the $\Gamma^i_{jk}$ of Example 3.10.
2. Find the differential equations of the geodesics for the line element

$$ds^2 = (dx^1)^2 + \sin^2 x^1\,(dx^2)^2$$

Integrate these equations with initial conditions $x^1 = \theta_0$, $x^2 = \varphi_0$, $\dfrac{dx^1}{ds} = 1$, $\dfrac{dx^2}{ds} = 0$ at s = 0.
3. Show that $\Gamma^i_{jk} = \Gamma^i_{kj}$.
4. Obtain the Christoffel symbols and the differential equations of the geodesics for the surface

$$x^1 = u^1\cos u^2 \qquad x^2 = u^1\sin u^2 \qquad x^3 = 0$$

The surface is the plane $x^3 = 0$, and the coordinates are polar coordinates, $ds^2 = (du^1)^2 + (u^1)^2(du^2)^2$.
5. From (3.44') show that

$$\frac{\partial g_{\alpha\beta}}{\partial x^\gamma} = g_{\sigma\beta}\Gamma^\sigma_{\alpha\gamma} + g_{\alpha\sigma}\Gamma^\sigma_{\beta\gamma}$$

6. Let $ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2$. Calculate $|g_{ij}|$, $\|g^{ij}\|$,


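7. If $\bar\Gamma^i_{jk}(\bar x) = \Gamma^\alpha_{\beta\gamma}(x)\dfrac{\partial x^\beta}{\partial \bar x^j}\dfrac{\partial x^\gamma}{\partial \bar x^k}\dfrac{\partial \bar x^i}{\partial x^\alpha} + \dfrac{\partial^2 x^\alpha}{\partial \bar x^j\,\partial \bar x^k}\dfrac{\partial \bar x^i}{\partial x^\alpha}$, show that $\Gamma^i_{jk} - \Gamma^i_{kj}$ is a tensor.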


8. Obtain the Christoffel symbols for a Euclidean space using cylindrical and 
spherical coordinates. 

9. The Christoffel symbols of the first kind are $[i, jk] = g_{i\sigma}\Gamma^\sigma_{jk}$. Show that $\Gamma^i_{jk} = g^{i\sigma}[\sigma, jk]$.




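3.8. Law of Transformation for the Christoffel Symbols. Let the equations of the geodesics be given by

$$\frac{d^2x^i}{ds^2} + \Gamma^i_{jk}(x)\frac{dx^j}{ds}\frac{dx^k}{ds} = 0 \tag{3.45}$$

and

$$\frac{d^2\bar x^i}{ds^2} + \bar\Gamma^i_{jk}(\bar x)\frac{d\bar x^j}{ds}\frac{d\bar x^k}{ds} = 0 \tag{3.46}$$

for the two coordinate systems $x^i$, $\bar x^i$ in a Riemannian space. We now find a relationship between the $\Gamma^i_{jk}$ and the $\bar\Gamma^i_{jk}$. From $\bar x^i = \bar x^i(x)$ we have

$$\frac{d\bar x^i}{ds} = \frac{\partial \bar x^i}{\partial x^\alpha}\frac{dx^\alpha}{ds} \qquad \frac{d^2\bar x^i}{ds^2} = \frac{\partial^2 \bar x^i}{\partial x^\beta\,\partial x^\alpha}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} + \frac{\partial \bar x^i}{\partial x^\alpha}\frac{d^2x^\alpha}{ds^2}$$

Substituting into (3.46) yields

$$\frac{\partial \bar x^i}{\partial x^\alpha}\frac{d^2x^\alpha}{ds^2} + \frac{\partial^2 \bar x^i}{\partial x^\beta\,\partial x^\alpha}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} + \bar\Gamma^i_{jk}\frac{\partial \bar x^j}{\partial x^\alpha}\frac{\partial \bar x^k}{\partial x^\beta}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0$$

Multiply this equation by $\dfrac{\partial x^\sigma}{\partial \bar x^i}$ and one obtains (after summing on i)

$$\frac{d^2x^\sigma}{ds^2} + \left(\bar\Gamma^i_{jk}\frac{\partial \bar x^j}{\partial x^\alpha}\frac{\partial \bar x^k}{\partial x^\beta}\frac{\partial x^\sigma}{\partial \bar x^i} + \frac{\partial^2 \bar x^i}{\partial x^\alpha\,\partial x^\beta}\frac{\partial x^\sigma}{\partial \bar x^i}\right)\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0$$

Comparing with (3.45), we have

$$\Gamma^\sigma_{\alpha\beta} = \bar\Gamma^i_{jk}\frac{\partial \bar x^j}{\partial x^\alpha}\frac{\partial \bar x^k}{\partial x^\beta}\frac{\partial x^\sigma}{\partial \bar x^i} + \frac{\partial^2 \bar x^i}{\partial x^\alpha\,\partial x^\beta}\frac{\partial x^\sigma}{\partial \bar x^i} \tag{3.47}$$

Equation (3.47) is the law of transformation for the Christoffel symbols. Note that the $\Gamma^i_{jk}$ are not the components of a mixed tensor so that $\Gamma^i_{jk}(x)$ may be zero in one coordinate system but not in all coordinate systems. Let the reader show that

$$\bar\Gamma^i_{jk} = \Gamma^\alpha_{\beta\gamma}\frac{\partial x^\beta}{\partial \bar x^j}\frac{\partial x^\gamma}{\partial \bar x^k}\frac{\partial \bar x^i}{\partial x^\alpha} + \frac{\partial^2 x^\alpha}{\partial \bar x^j\,\partial \bar x^k}\frac{\partial \bar x^i}{\partial x^\alpha} \tag{3.48}$$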





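Example 3.11. From Prob. 4, Sec. 1.2, and the definition of the $g^{ij}$ we have

$$\frac{\partial |g_{ij}|}{\partial x^\mu} = |g_{ij}|\,g^{\alpha\beta}\frac{\partial g_{\alpha\beta}}{\partial x^\mu}$$

From Prob. 5, Sec. 3.7,

$$\frac{\partial g_{\alpha\beta}}{\partial x^\mu} = g_{\sigma\beta}\Gamma^\sigma_{\alpha\mu} + g_{\alpha\sigma}\Gamma^\sigma_{\beta\mu}$$

so that

$$\frac{\partial \ln |g_{ij}|}{\partial x^\mu} = g^{\alpha\beta}g_{\sigma\beta}\Gamma^\sigma_{\alpha\mu} + g^{\alpha\beta}g_{\alpha\sigma}\Gamma^\sigma_{\beta\mu} = \delta^\alpha_\sigma\Gamma^\sigma_{\alpha\mu} + \delta^\beta_\sigma\Gamma^\sigma_{\beta\mu} = \Gamma^\alpha_{\alpha\mu} + \Gamma^\beta_{\beta\mu} = 2\Gamma^\alpha_{\alpha\mu}$$

Hence

$$\frac{\partial \ln |g_{ij}|^{1/2}}{\partial x^\mu} = \Gamma^\alpha_{\alpha\mu} \tag{3.49}$$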




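Example 3.12. Let us consider a Euclidean space using Euclidean coordinates. In this coordinate system $\Gamma^i_{jk}(x) = 0$. From (3.48)

$$\bar\Gamma^i_{jk}(\bar x) = \frac{\partial^2 x^\alpha}{\partial \bar x^j\,\partial \bar x^k}\frac{\partial \bar x^i}{\partial x^\alpha}$$

for the $\bar x$ coordinate system. If the $\bar\Gamma^i_{jk}$ also vanish, then $\dfrac{\partial^2 x^\sigma}{\partial \bar x^j\,\partial \bar x^k} = 0$ of necessity, so that

$$x^\sigma = a^\sigma_\alpha \bar x^\alpha + b^\sigma$$

where $a^\sigma_\alpha$, $b^\sigma$ are constants of integration. Hence the coordinate transformation between two cartesian coordinate systems is linear. If the transformation is orthogonal [see (1.53)], then $\bar g_{ij} = g_{ij} = \delta_{ij}$, so that $\bar g_{ij} = g_{\alpha\beta}\dfrac{\partial x^\alpha}{\partial \bar x^i}\dfrac{\partial x^\beta}{\partial \bar x^j}$ reduces to

$$\delta_{ij} = \delta_{\alpha\beta}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j}$$

Multiplying by $\dfrac{\partial \bar x^j}{\partial x^\sigma}$ and summing on j, we have

$$\delta_{ij}\frac{\partial \bar x^j}{\partial x^\sigma} = \delta_{\alpha\beta}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j}\frac{\partial \bar x^j}{\partial x^\sigma}$$

so that

$$\frac{\partial \bar x^i}{\partial x^\sigma} = \delta_{\alpha\beta}\,\delta^\beta_\sigma\frac{\partial x^\alpha}{\partial \bar x^i} = \frac{\partial x^\sigma}{\partial \bar x^i} \tag{3.50}$$

Now let us compare the laws of transformation for covariant and contravariant vectors. We have

$$\bar A^i = A^\alpha\frac{\partial \bar x^i}{\partial x^\alpha} \qquad \bar A_i = A_\alpha\frac{\partial x^\alpha}{\partial \bar x^i} \tag{3.51}$$

Making use of (3.50) yields

$$\bar A^i = \sum_{\alpha=1}^{n} A^\alpha\frac{\partial x^\alpha}{\partial \bar x^i} \tag{3.52}$$

so that orthogonal transformations affect contravariant vectors in exactly the same way covariant vectors are affected. This is why there was no distinction made between these two types of vectors in the elementary treatment of vectors.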


Problems 
1. Prove (3.48). 
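2. By differentiating the identity $g^{i\alpha}g_{\alpha j} = \delta^i_j$, show that

$$\frac{\partial g^{ik}}{\partial x^j} = -g^{hk}\Gamma^i_{hj} - g^{ih}\Gamma^k_{hj}$$

3. Show that the law of transformation for the Christoffel symbols is transitive.
4. Derive the law of transformation for the Christoffel symbols from

$$\bar g_{ij} = g_{\alpha\beta}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j}$$

5. Define $\bar g_{ij}(x) = u(x)\,g_{ij}(x)$. Show that

$$\bar\Gamma^i_{jk}(x) = \Gamma^i_{jk}(x) + \delta^i_j\varphi_k + \delta^i_k\varphi_j - g^{i\sigma}g_{jk}\varphi_\sigma$$

where $\varphi_\sigma = \dfrac12\dfrac{\partial \ln u}{\partial x^\sigma}$ and $\bar\Gamma^i_{jk}$, $\Gamma^i_{jk}$ are the Christoffel symbols for the $\bar g_{ij}$, $g_{ij}$, respectively.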


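3.9. Covariant Differentiation. Let us differentiate the absolute covariant vector given by the transformation

$$\bar A_i = A_\alpha\frac{\partial x^\alpha}{\partial \bar x^i}$$

with respect to an absolute scalar $t = \bar t$. We obtain

$$\frac{d\bar A_i}{dt} = \frac{dA_\alpha}{dt}\frac{\partial x^\alpha}{\partial \bar x^i} + A_\alpha\frac{\partial^2 x^\alpha}{\partial \bar x^j\,\partial \bar x^i}\frac{d\bar x^j}{dt} \tag{3.53}$$

since $t = \bar t$. It is at once apparent that $\dfrac{dA_i}{dt}$ is not a covariant vector. We wish to determine a vector (covariant) which will reduce to the ordinary derivative in a Euclidean space using Euclidean coordinates. We accomplish this in the following manner by making use of the transformation law for the Christoffel symbols: Multiply (3.48) by $\bar A_i\dfrac{d\bar x^k}{dt} = A_\sigma\dfrac{\partial x^\sigma}{\partial \bar x^i}\dfrac{d\bar x^k}{dt}$ to obtain

$$\bar\Gamma^i_{jk}\bar A_i\frac{d\bar x^k}{dt} = \Gamma^\alpha_{\beta\gamma}A_\sigma\frac{\partial x^\beta}{\partial \bar x^j}\frac{\partial x^\gamma}{\partial \bar x^k}\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial x^\sigma}{\partial \bar x^i}\frac{d\bar x^k}{dt} + A_\sigma\frac{\partial^2 x^\alpha}{\partial \bar x^j\,\partial \bar x^k}\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial x^\sigma}{\partial \bar x^i}\frac{d\bar x^k}{dt}$$
$$= \Gamma^\alpha_{\beta\gamma}A_\alpha\frac{\partial x^\beta}{\partial \bar x^j}\frac{dx^\gamma}{dt} + A_\alpha\frac{\partial^2 x^\alpha}{\partial \bar x^j\,\partial \bar x^k}\frac{d\bar x^k}{dt}$$

Subtracting from (3.53), written with the free index j, yields

$$\frac{d\bar A_j}{dt} - \bar\Gamma^i_{jk}\bar A_i\frac{d\bar x^k}{dt} = \left(\frac{dA_\beta}{dt} - \Gamma^\alpha_{\beta\gamma}A_\alpha\frac{dx^\gamma}{dt}\right)\frac{\partial x^\beta}{\partial \bar x^j} \tag{3.54}$$

Hence $\dfrac{dA_i}{dt} - \Gamma^\alpha_{i\beta}A_\alpha\dfrac{dx^\beta}{dt}$ is a covariant vector.

For Euclidean coordinates $\Gamma^\alpha_{i\beta} = 0$ so that this covariant vector becomes the ordinary derivative $\dfrac{dA_i}{dt}$. We call $\dfrac{dA_i}{dt} - \Gamma^\alpha_{i\beta}A_\alpha\dfrac{dx^\beta}{dt}$ the intrinsic derivative of $A_i$ with respect to t. Its value depends on the direction we choose since it involves $\dfrac{dx^\beta}{dt}$. We write

$$\frac{\delta A_i}{\delta t} = \frac{dA_i}{dt} - \Gamma^\alpha_{i\beta}A_\alpha\frac{dx^\beta}{dt} \tag{3.55}$$

Since

$$\frac{\delta A_i}{\delta t} = \left(\frac{\partial A_i}{\partial x^\beta} - \Gamma^\alpha_{i\beta}A_\alpha\right)\frac{dx^\beta}{dt}$$

is a vector for arbitrary $\dfrac{dx^\beta}{dt}$, it follows from the quotient law that

$$\frac{\partial A_i}{\partial x^\beta} - \Gamma^\alpha_{i\beta}A_\alpha \tag{3.56}$$

is a covariant tensor of rank 2. We call (3.56) the covariant derivative of $A_i$, and we write

$$A_{i,j} = \frac{\partial A_i}{\partial x^j} - \Gamma^\alpha_{ij}A_\alpha \tag{3.57}$$

The comma in $A_{i,j}$ denotes covariant differentiation.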
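We now consider a scalar of weight N,

$$\bar A = \left|\frac{\partial x}{\partial \bar x}\right|^N A$$

We have

$$\frac{\partial \bar A}{\partial \bar x^j} = \left|\frac{\partial x}{\partial \bar x}\right|^N\frac{\partial A}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial \bar x^j} + N\left|\frac{\partial x}{\partial \bar x}\right|^{N-1}\frac{\partial}{\partial \bar x^j}\left|\frac{\partial x}{\partial \bar x}\right| A$$

From Prob. 9, Sec. 1.2, we have

$$\frac{\partial}{\partial \bar x^j}\left|\frac{\partial x}{\partial \bar x}\right| = \left|\frac{\partial x}{\partial \bar x}\right|\frac{\partial^2 x^\beta}{\partial \bar x^j\,\partial \bar x^\alpha}\frac{\partial \bar x^\alpha}{\partial x^\beta}$$

so that

$$\frac{\partial \bar A}{\partial \bar x^j} = \left|\frac{\partial x}{\partial \bar x}\right|^N\frac{\partial A}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial \bar x^j} + N\left|\frac{\partial x}{\partial \bar x}\right|^N A\,\frac{\partial^2 x^\beta}{\partial \bar x^j\,\partial \bar x^\alpha}\frac{\partial \bar x^\alpha}{\partial x^\beta} \tag{3.58}$$

Multiplying $\bar\Gamma^\alpha_{\alpha j} = \Gamma^\beta_{\beta\gamma}\dfrac{\partial x^\gamma}{\partial \bar x^j} + \dfrac{\partial^2 x^\beta}{\partial \bar x^\alpha\,\partial \bar x^j}\dfrac{\partial \bar x^\alpha}{\partial x^\beta}$ by $N\bar A$ and subtracting from (3.58) yields

$$\frac{\partial \bar A}{\partial \bar x^j} - N\bar A\,\bar\Gamma^\alpha_{\alpha j} = \left|\frac{\partial x}{\partial \bar x}\right|^N\left(\frac{\partial A}{\partial x^\alpha} - NA\,\Gamma^\beta_{\beta\alpha}\right)\frac{\partial x^\alpha}{\partial \bar x^j}$$

Hence the invariant (in form), $\dfrac{\partial A}{\partial x^j} - NA\,\Gamma^\alpha_{\alpha j}$, is a covariant vector of weight N. We write

$$A_{,j} = \frac{\partial A}{\partial x^j} - NA\,\Gamma^\alpha_{\alpha j} \tag{3.59}$$

We call $A_{,j}$ the covariant derivative of A. The comma in $A_{,j}$ denotes covariant differentiation. If A is an absolute scalar, N = 0 and $A_{,j} = \dfrac{\partial A}{\partial x^j}$, the gradient of A.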


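In general, it can be shown that if $T^{\alpha_1\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_s}(x)$ is a relative tensor of weight N, then

$$T^{\alpha_1\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_s,\mu} = \frac{\partial T^{\alpha_1\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_s}}{\partial x^\mu} + T^{\sigma\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_s}\Gamma^{\alpha_1}_{\sigma\mu} + \cdots + T^{\alpha_1\alpha_2\cdots\alpha_{r-1}\sigma}_{\beta_1\beta_2\cdots\beta_s}\Gamma^{\alpha_r}_{\sigma\mu}$$
$$\qquad - T^{\alpha_1\alpha_2\cdots\alpha_r}_{\sigma\beta_2\cdots\beta_s}\Gamma^{\sigma}_{\beta_1\mu} - \cdots - T^{\alpha_1\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_{s-1}\sigma}\Gamma^{\sigma}_{\beta_s\mu} - N\,T^{\alpha_1\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_s}\Gamma^{\sigma}_{\sigma\mu} \tag{3.60}$$

is a relative tensor of weight N, of covariant rank one greater than $T^{\alpha_1\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_s}$. $T^{\alpha_1\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_s,\mu}$ is called the covariant derivative of $T^{\alpha_1\alpha_2\cdots\alpha_r}_{\beta_1\beta_2\cdots\beta_s}$. Since the covariant derivative is a tensor, successive covariant differentiations can be applied.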


Example 3.13. We apply (3.60) to the metric tensor $g_{ij}$. We have

$$g_{ij,k} = \frac{\partial g_{ij}}{\partial x^k} - g_{\sigma j}\Gamma^\sigma_{ik} - g_{i\sigma}\Gamma^\sigma_{jk} = 0$$

from Prob. 5, Sec. 3.7.
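Example 3.14. Curl of a Vector. Let $A_i$ be an absolute covariant vector. We have

$$A_{i,j} = \frac{\partial A_i}{\partial x^j} - A_\alpha\Gamma^\alpha_{ij} \quad\text{and}\quad A_{j,i} = \frac{\partial A_j}{\partial x^i} - A_\alpha\Gamma^\alpha_{ji}$$

so that $A_{i,j} - A_{j,i} = \dfrac{\partial A_i}{\partial x^j} - \dfrac{\partial A_j}{\partial x^i}$ is a tensor. It is called the curl of A and is a covariant tensor of rank 2. Strictly speaking, the curl is not a vector but a tensor of rank 2. In a three-dimensional space, however, the curl may be looked upon as a vector [see (3.10)]. If $A_i = \dfrac{\partial \varphi}{\partial x^i}$, then curl $A_i = 0$. Conversely, if curl $A_i = 0$, then $A_i = \dfrac{\partial \varphi}{\partial x^i}$, where [see (2.79)]

$$\varphi(x^1, x^2, \ldots, x^n) = \int_{x_0^1}^{x^1} A_1(x^1, x^2, \ldots, x^n)\,dx^1 + \int_{x_0^2}^{x^2} A_2(x_0^1, x^2, \ldots, x^n)\,dx^2 + \cdots + \int_{x_0^n}^{x^n} A_n(x_0^1, x_0^2, \ldots, x_0^{n-1}, x^n)\,dx^n \tag{3.61}$$

The constants $x_0^1, x_0^2, \ldots, x_0^n$ can be chosen arbitrarily.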


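Example 3.15. Divergence of a Vector. The divergence of an absolute contravariant vector is defined as the contraction of its covariant derivative. Hence

$$\operatorname{div} A^i = A^i_{,i} = \frac{\partial A^i}{\partial x^i} + A^\alpha\Gamma^i_{\alpha i}$$

From (3.49), $\Gamma^i_{\alpha i} = \dfrac{\partial \ln |g|^{1/2}}{\partial x^\alpha}$, $|g| = |g_{ij}|$, so that

$$\operatorname{div} A^i = \frac{\partial A^\alpha}{\partial x^\alpha} + \frac{A^\alpha}{|g|^{1/2}}\frac{\partial |g|^{1/2}}{\partial x^\alpha} = \frac{1}{|g|^{1/2}}\frac{\partial}{\partial x^\alpha}\left(|g|^{1/2}A^\alpha\right) \tag{3.62}$$

If we wish to obtain the divergence of $A_i$, we consider the associated vector $A^i = g^{i\alpha}A_\alpha$. The $A^\alpha$ of (3.62) are the vector components of A. To keep div $A^i$ dimensionally correct, we replace the vector components of $A^i$ by the physical components of $A^i$. For spherical coordinates

$$|g|^{1/2} = \begin{vmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{vmatrix}^{1/2} = r^2\sin\theta$$

so that

$$\operatorname{div} A^i = \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}\left(r^2\sin\theta\,A^r\right) + \frac{\partial}{\partial\theta}\left(r^2\sin\theta\,A^\theta\right) + \frac{\partial}{\partial\varphi}\left(r^2\sin\theta\,A^\varphi\right)\right]$$

and changing to physical components $A_r$, $A_\theta$, $A_\varphi$ (see Prob. 4, Sec. 3.2)

$$\operatorname{div} A = \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}\left(r^2\sin\theta\,A_r\right) + \frac{\partial}{\partial\theta}\left(r\sin\theta\,A_\theta\right) + \frac{\partial}{\partial\varphi}\left(r A_\varphi\right)\right]$$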
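Example 3.16. The Laplacian of a Scalar Invariant. The Laplacian of the scalar invariant $\varphi$ is defined as the divergence of the gradient of $\varphi$. We consider the associated vector of the gradient of $\varphi$, namely, $g^{\alpha\beta}\dfrac{\partial\varphi}{\partial x^\beta}$. Applying (3.62) to $g^{\alpha\beta}\dfrac{\partial\varphi}{\partial x^\beta}$ yields the Laplacian

$$\operatorname{Lap}\varphi = \nabla^2\varphi = \frac{1}{|g|^{1/2}}\frac{\partial}{\partial x^\alpha}\left(|g|^{1/2}g^{\alpha\beta}\frac{\partial\varphi}{\partial x^\beta}\right) \tag{3.63}$$

In spherical coordinates

$$|g|^{1/2} = r^2\sin\theta \qquad \|g^{ij}\| = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & \dfrac{1}{r^2} & 0 \\ 0 & 0 & \dfrac{1}{r^2\sin^2\theta} \end{Vmatrix}$$

$$\nabla^2 V = \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}\left(r^2\sin\theta\,\frac{\partial V}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial V}{\partial\theta}\right) + \frac{\partial}{\partial\varphi}\left(\frac{1}{\sin\theta}\frac{\partial V}{\partial\varphi}\right)\right]$$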
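Formula (3.63) lends itself to a quick numerical sanity check. The sketch below is an illustration under stated assumptions, not the author's method: it evaluates (3.63) for the diagonal spherical metric using nested central differences and applies it to two test functions whose Laplacians are known from cartesian coordinates, $V = x^2 + y^2 + z^2 = r^2$ (Laplacian 6) and $V = z = r\cos\theta$ (Laplacian 0). The step size and sample point are arbitrary choices.

```python
import math

def sqrt_g(q):
    """|g|^(1/2) = r^2 sin(theta) for spherical coordinates q = (r, theta, phi)."""
    r, theta, _ = q
    return r * r * math.sin(theta)

def g_inv_diag(q):
    """Diagonal entries of g^{ij}: 1, 1/r^2, 1/(r^2 sin^2 theta)."""
    r, theta, _ = q
    return [1.0, 1.0 / r ** 2, 1.0 / (r * math.sin(theta)) ** 2]

def partial(f, q, a, h):
    """Central-difference estimate of d f / d q^a."""
    qp, qm = list(q), list(q)
    qp[a] += h
    qm[a] -= h
    return (f(qp) - f(qm)) / (2 * h)

def laplacian(V, q, h=1e-3):
    """(3.63) for a diagonal metric: |g|^(-1/2) d_a(|g|^(1/2) g^{aa} d_a V)."""
    total = 0.0
    for a in range(3):
        # flux(p) = |g|^(1/2) g^{aa} dV/dq^a evaluated at the point p
        flux = lambda p, a=a: sqrt_g(p) * g_inv_diag(p)[a] * partial(V, p, a, h)
        total += partial(flux, q, a, h)
    return total / sqrt_g(q)

V_r2 = lambda q: q[0] ** 2                 # V = r^2, Laplacian should be 6
V_z = lambda q: q[0] * math.cos(q[1])      # V = z, a harmonic function
q0 = [1.3, 0.9, 0.4]                       # sample point (r, theta, phi)
lap_r2 = laplacian(V_r2, q0)
lap_z = laplacian(V_z, q0)
```

Both values agree with the cartesian results up to the truncation error of the finite differences.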
Problems 
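1. Starting with $\bar A^i = A^\alpha\dfrac{\partial \bar x^i}{\partial x^\alpha}$, show that $A^i_{,j} = \dfrac{\partial A^i}{\partial x^j} + A^\alpha\Gamma^i_{\alpha j}$ is a mixed tensor without recourse to (3.60).
2. Show that $(A^i B_j)_{,k} = A^i B_{j,k} + A^i_{,k}B_j$.
3. Show that $(g_{i\alpha}A^\alpha)_{,j} = g_{i\alpha}A^\alpha_{,j}$.
4. Show that $|g_{ij}|_{,k} = 0$.
5. Show that $\delta^i_{j,k} = 0$.
6. Find the Laplacian of V in cylindrical coordinates.
7. Show that $A^i_{j,k} = \dfrac{\partial A^i_j}{\partial x^k} + \Gamma^i_{\alpha k}A^\alpha_j - \Gamma^\alpha_{jk}A^i_\alpha$ for an absolute mixed tensor $A^i_j$ without recourse to (3.60).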
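8. Show that $A^\alpha_{i,\alpha} = \dfrac{1}{|g|^{1/2}}\dfrac{\partial}{\partial x^\alpha}\left(|g|^{1/2}A^\alpha_i\right) - A^\alpha_\beta\Gamma^\beta_{i\alpha}$.
9. Write out the form of $A^i_{,jk}$ (two covariant differentiations).
10. If $A_i(x, t)$ is a covariant vector, show that $\dfrac{\delta A_i}{\delta t} = \dfrac{\partial A_i}{\partial t} + A_{i,\alpha}\dfrac{dx^\alpha}{dt}$. Show that $\dfrac{\partial v^i}{\partial t} + v^i_{,\alpha}v^\alpha$ is the acceleration vector if $v^i$ is the velocity vector.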




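3.10. Geodesic Coordinates. Since the Christoffel symbols $\Gamma^i_{jk}$ are not the components of a tensor, it may be possible to find a coordinate system in the neighborhood of $x^i = q^i$ such that $\bar\Gamma^i_{jk}(q) = 0$. We now show that this can be done. Let

$$\bar x^i = (x^i - q^i) + \tfrac12\Gamma^i_{\alpha\beta}(q)(x^\alpha - q^\alpha)(x^\beta - q^\beta) \tag{3.64}$$

so that $\left(\dfrac{\partial \bar x^i}{\partial x^j}\right)_q = \delta^i_j$ and $\left|\dfrac{\partial \bar x}{\partial x}\right|_q = 1$. Hence (3.64) is nonsingular in a neighborhood of x = q. The point $x^i = q^i$ corresponds to the point $\bar x^i = 0$, that is, the point x = q now becomes the origin of the $\bar x$ coordinate system. Differentiating (3.64) with respect to $\bar x^j$ yields

$$\delta^i_j = \frac{\partial x^i}{\partial \bar x^j} + \Gamma^i_{\alpha\beta}(q)(x^\alpha - q^\alpha)\frac{\partial x^\beta}{\partial \bar x^j} \tag{3.65}$$

since $\Gamma^i_{\alpha\beta}(q) = \Gamma^i_{\beta\alpha}(q)$. Thus $\left(\dfrac{\partial x^i}{\partial \bar x^j}\right)_q = \delta^i_j$. Differentiating (3.65) with respect to $\bar x^k$ yields

$$0 = \frac{\partial^2 x^i}{\partial \bar x^k\,\partial \bar x^j} + \Gamma^i_{\alpha\beta}(q)\frac{\partial x^\alpha}{\partial \bar x^k}\frac{\partial x^\beta}{\partial \bar x^j} + \Gamma^i_{\alpha\beta}(q)(x^\alpha - q^\alpha)\frac{\partial^2 x^\beta}{\partial \bar x^k\,\partial \bar x^j}$$

so that

$$\left(\frac{\partial^2 x^i}{\partial \bar x^k\,\partial \bar x^j}\right)_q = -\Gamma^i_{\alpha\beta}(q)\left(\frac{\partial x^\alpha}{\partial \bar x^k}\right)_q\left(\frac{\partial x^\beta}{\partial \bar x^j}\right)_q = -\Gamma^i_{\alpha\beta}(q)\,\delta^\alpha_k\delta^\beta_j = -\Gamma^i_{kj}(q) = -\Gamma^i_{jk}(q)$$

Now

$$\bar\Gamma^i_{jk}(\bar x) = \Gamma^\alpha_{\beta\gamma}(x)\frac{\partial x^\beta}{\partial \bar x^j}\frac{\partial x^\gamma}{\partial \bar x^k}\frac{\partial \bar x^i}{\partial x^\alpha} + \frac{\partial^2 x^\alpha}{\partial \bar x^j\,\partial \bar x^k}\frac{\partial \bar x^i}{\partial x^\alpha}$$

and evaluating at $x^i = q^i$, $\bar x^i = 0$, yields

$$\bar\Gamma^i_{jk}(0) = \Gamma^\alpha_{\beta\gamma}(q)\,\delta^\beta_j\delta^\gamma_k\delta^i_\alpha - \Gamma^\alpha_{jk}(q)\,\delta^i_\alpha = \Gamma^i_{jk}(q) - \Gamma^i_{jk}(q) = 0 \qquad \text{Q.E.D.}$$

Any system of coordinates for which $(\Gamma^i_{jk})_P = 0$ at a point P is called a geodesic coordinate system. In such a system of coordinates the covariant derivative, when evaluated at P, becomes the ordinary derivative when evaluated at P. For example,

$$(A^i_{,j})_P = \left(\frac{\partial A^i}{\partial x^j}\right)_P + (\Gamma^i_{\alpha j})_P (A^\alpha)_P = \left(\frac{\partial A^i}{\partial x^j}\right)_P$$

since $(\Gamma^i_{\alpha j})_P = 0$, if the $x^i$ are the coordinates of a geodesic coordinate system.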
system. 


We now show that

$$(A^i + B^i)_{,j} = A^i_{,j} + B^i_{,j} \qquad (A^i B^j)_{,k} = A^i B^j_{,k} + A^i_{,k}B^j \tag{3.66}$$

System (3.66) is true in geodesic coordinates at any point P. But if two tensors are equal to each other (components are equal) in one coordinate system, they are equal to each other in all coordinate systems. Hence (3.66) holds for all coordinate systems at any point P.

Equation (3.64) yields one geodesic coordinate system. There are infinitely many such systems since we could have added

$$A^i(q)\,G_{\alpha\beta\gamma}(q)(x^\alpha - q^\alpha)(x^\beta - q^\beta)(x^\gamma - q^\gamma)$$

to the right-hand side of (3.64) and still have obtained $\bar\Gamma^i_{jk}(0) = 0$.
A special type of geodesic coordinate system is the following: Let us consider the family of geodesics passing through the point P, $x^i = x^i_0$. Each $\xi^i = \left(\dfrac{dx^i}{ds}\right)_P$ determines a unique geodesic passing through P. This follows since the differential equations of the geodesics are of second order. Suitable restrictions (say, analyticity) on the $\Gamma^i_{jk}(x)$ guarantee a unique solution of the second-order system of differential equations when the initial conditions are proposed. The $\xi^i$ are the components of the tangent vector of the geodesic through P. We now move along the geodesic (determined by $\xi^i$) a distance s. This determines a unique point Q. Conversely, if Q is near P, there is a unique geodesic passing through P and Q which determines a unique $\xi^i$ and s. We define

$$\bar x^i = \xi^i s \tag{3.67}$$

The $\bar x^i$ are called Riemannian coordinates. A simple example of Riemannian coordinates occurs in a two-dimensional Euclidean space. There is a unique geodesic through the origin with slope $m = \tan\theta$. The r coordinate of polar coordinates corresponds to the s of Riemannian coordinates. Thus $(\xi^1 = \cos\theta,\ \xi^2 = \sin\theta)$, r = s, and $\bar x^1 = r\cos\theta$, $\bar x^2 = r\sin\theta$. In this case the Riemannian coordinate system $(\bar x^1, \bar x^2)$ corresponds to the Euclidean coordinate system (x, y).

The differential equations of the geodesics in Riemannian coordinates are

$$\frac{d^2\bar x^i}{ds^2} + \bar\Gamma^i_{jk}(\bar x)\frac{d\bar x^j}{ds}\frac{d\bar x^k}{ds} = 0$$

But $\dfrac{d\bar x^i}{ds} = \xi^i$, $\dfrac{d^2\bar x^i}{ds^2} = 0$, so that

$$\bar\Gamma^i_{jk}(\bar x)\,\xi^j\xi^k = 0 \tag{3.68}$$

Since (3.68) holds at the point P (the origin of the Riemannian coordinate system, $\bar x^i = 0$) for arbitrary $\xi^i$, it follows that $\bar\Gamma^i_{jk}(0) = 0$. Hence a Riemannian coordinate system is a geodesic coordinate system.
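3.11. The Curvature Tensor. We consider the absolute vector $V^i$. Its covariant derivative yields the mixed tensor

$$V^i_{,j} = \frac{\partial V^i}{\partial x^j} + V^\alpha\Gamma^i_{\alpha j}$$

On again differentiating covariantly we obtain

$$V^i_{,jk} = \frac{\partial (V^i_{,j})}{\partial x^k} + V^\alpha_{,j}\Gamma^i_{\alpha k} - V^i_{,\alpha}\Gamma^\alpha_{jk}$$
$$= \frac{\partial^2 V^i}{\partial x^k\,\partial x^j} + \frac{\partial V^\alpha}{\partial x^k}\Gamma^i_{\alpha j} + V^\alpha\frac{\partial \Gamma^i_{\alpha j}}{\partial x^k} + \left(\frac{\partial V^\alpha}{\partial x^j} + V^\beta\Gamma^\alpha_{\beta j}\right)\Gamma^i_{\alpha k} - V^i_{,\alpha}\Gamma^\alpha_{jk}$$

Interchanging k and j and subtracting yields

$$V^i_{,jk} - V^i_{,kj} = V^\alpha R^i_{\alpha jk} \tag{3.69}$$

where

$$R^i_{\alpha jk} = \frac{\partial \Gamma^i_{\alpha j}}{\partial x^k} - \frac{\partial \Gamma^i_{\alpha k}}{\partial x^j} + \Gamma^\beta_{\alpha j}\Gamma^i_{\beta k} - \Gamma^\beta_{\alpha k}\Gamma^i_{\beta j} \tag{3.70}$$

A necessary and sufficient condition that $V^i_{,jk} = V^i_{,kj}$ is that $R^i_{\alpha jk} = 0$. It follows from (3.69) and the quotient law that $R^i_{\alpha jk}$ is a tensor. The tensor $R^i_{\alpha jk}$ depends only on the metric tensor of the space. It is called the curvature tensor. Its name and importance will become apparent in the next two paragraphs. The contracted curvature tensor

$$R_{ij} = R^\alpha_{ij\alpha} = \frac{\partial \Gamma^\alpha_{ij}}{\partial x^\alpha} - \frac{\partial \Gamma^\alpha_{i\alpha}}{\partial x^j} + \Gamma^\beta_{ij}\Gamma^\alpha_{\beta\alpha} - \Gamma^\beta_{i\alpha}\Gamma^\alpha_{\beta j} \tag{3.71}$$

is called the Ricci tensor and plays a most important role in the general theory of relativity.

The scalar invariant $R = g^{ij}R_{ij}$ is called the scalar curvature. The tensor

$$R_{hijk} = g_{h\alpha}R^\alpha_{ijk} \tag{3.72}$$

is called the Riemann-Christoffel, or covariant curvature, tensor.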
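To make the definitions of this section concrete, the following sketch evaluates the curvature tensor (3.70), the Ricci contraction on the last index as in (3.71), and the scalar curvature for the unit sphere, $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$. The Christoffel symbols used, $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$, are the standard ones for this metric and are supplied here as an assumption rather than taken from the text; their derivatives are approximated by central differences.

```python
import math

N = 2  # dimension; index 0 is theta, index 1 is phi

def gamma(x):
    """Christoffel symbols G[i][j][k] of ds^2 = dtheta^2 + sin^2(theta) dphi^2."""
    th = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(N)]
    G[0][1][1] = -math.sin(th) * math.cos(th)
    G[1][0][1] = G[1][1][0] = math.cos(th) / math.sin(th)
    return G

def d_gamma(x, k, h=1e-5):
    """Central-difference estimate of d Gamma^i_{a j} / d x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    Gp, Gm = gamma(xp), gamma(xm)
    return [[[(Gp[i][a][j] - Gm[i][a][j]) / (2 * h) for j in range(N)]
             for a in range(N)] for i in range(N)]

def riemann(x):
    """R[i][a][j][k] following (3.70)."""
    G = gamma(x)
    dG = [d_gamma(x, k) for k in range(N)]  # dG[k][i][a][j]
    R = [[[[0.0] * N for _ in range(N)] for _ in range(N)] for _ in range(N)]
    for i in range(N):
        for a in range(N):
            for j in range(N):
                for k in range(N):
                    R[i][a][j][k] = (dG[k][i][a][j] - dG[j][i][a][k]
                                     + sum(G[b][a][j] * G[i][b][k]
                                           - G[b][a][k] * G[i][b][j]
                                           for b in range(N)))
    return R

def ricci(x):
    """R_ij = R^alpha_{i j alpha}, the contraction on the last index."""
    R = riemann(x)
    return [[sum(R[a][i][j][a] for a in range(N)) for j in range(N)]
            for i in range(N)]

x0 = [0.8, 0.3]                          # sample point (theta, phi)
Ric = ricci(x0)
g_inv = [1.0, 1.0 / math.sin(x0[0]) ** 2]  # diagonal of g^{ij}
scalar_R = g_inv[0] * Ric[0][0] + g_inv[1] * Ric[1][1]
```

For this metric the sketch gives $R_{\theta\theta} \approx 1$, $R_{\varphi\varphi} \approx \sin^2\theta$, and scalar curvature close to 2, so the curvature tensor does not vanish on the sphere.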
Problems

1. Show that, for Riemannian coordinates, $\bar\Gamma^i_{jk}(\bar x)\,\bar x^j\bar x^k = 0$.
2. Show that $(A^i_j + B^i_j)_{,k} = A^i_{j,k} + B^i_{j,k}$.
3. Show that $R_{ij} = R_{ji}$.
4. Show that $R^i_{jkl,m} + R^i_{jlm,k} + R^i_{jmk,l} = 0$. This is the Bianchi identity. Hint: Use geodesic coordinates.
5. Show that $R_{hijk} = -R_{ihjk} = -R_{hikj}$, $R_{iijk} = 0$, $R_{hijj} = 0$.
6. If $R_{ij} = kg_{ij}$, show that R = nk, n the dimension of the space.


3.12. Euclidean, or Flat, Space. If a space is Euclidean, there exists a coordinate system for which the $g_{ij}$ are constants, so that $\Gamma^i_{jk}(x) = 0$, $\dfrac{\partial \Gamma^i_{jk}}{\partial x^l} = 0$, and $R^i_{jkl}$ vanishes (all its components are zero). We now show the converse. If the curvature tensor is zero, the space is Euclidean. Let us first note that if an x coordinate system exists such that the $g_{ij}(x)$ are constants then $\Gamma^i_{jk}(x) = 0$, and conversely. If there is an x coordinate system for which $\Gamma^i_{jk}(x) = 0$, then

$$\bar\Gamma^i_{jk}(y) = \frac{\partial^2 x^\sigma}{\partial y^j\,\partial y^k}\frac{\partial y^i}{\partial x^\sigma} \tag{3.73}$$

and conversely, if (3.73) holds, then $\Gamma^i_{jk}(x) = 0$. Our problem reduces to the following: Given the Christoffel symbols $\bar\Gamma^i_{jk}(y)$ in any y coordinate system, can we find a set of $x^i$, i = 1, 2, 3, . . . , n, satisfying (3.73)? The system of second-order differential equations (3.73) can be written

$$\frac{\partial^2 x^\sigma}{\partial y^j\,\partial y^k} = \bar\Gamma^\mu_{jk}(y)\frac{\partial x^\sigma}{\partial y^\mu} \tag{3.74}$$

We reduce (3.74) to a system of first-order differential equations. Let

$$\frac{\partial x^\sigma}{\partial y^i} = u^\sigma_i \tag{3.75}$$

so that

$$\frac{\partial u^\sigma_j}{\partial y^k} = u^\sigma_\mu\bar\Gamma^\mu_{jk}(y) \tag{3.76}$$

For each $\sigma$ we have the first-order system of differential equations given by (3.75) and (3.76). These equations are special cases of the more general system

$$\frac{\partial z^k}{\partial y^j} = F^k_j(z^1, z^2, \ldots, z^{n+1}, y^1, y^2, \ldots, y^n) \tag{3.77}$$

with k = 1, 2, . . . , n + 1; j = 1, 2, . . . , n. If we let $z^1 = x^\sigma$, $z^2 = u^\sigma_1$, . . . , $z^{n+1} = u^\sigma_n$, (3.77) reduces to (3.75), (3.76).

If a solution of (3.77) exists, then of necessity (assuming differentiability and continuity of the second mixed derivatives)

$$\frac{\partial^2 z^k}{\partial y^l\,\partial y^j} = \frac{\partial^2 z^k}{\partial y^j\,\partial y^l}$$

or

$$\frac{\partial F^k_j}{\partial y^l} + \frac{\partial F^k_j}{\partial z^\mu}F^\mu_l = \frac{\partial F^k_l}{\partial y^j} + \frac{\partial F^k_l}{\partial z^\mu}F^\mu_j \tag{3.78}$$

If the $F^k_j$ are analytic, it can be shown that the integrability conditions (3.78) are also sufficient that (3.77) has a unique solution satisfying the initial conditions $z^k = z^k_0$ at $y^i = y^i_0$. The reader is referred to the elegant proof found in Gaston Darboux, "Leçons sur les systèmes orthogonaux et les coordonnées curvilignes," pp. 325-336, Gauthier-Villars, Paris, 1910.

The integrability conditions (3.78) when referred to (3.75), (3.76) become

$$\bar\Gamma^\mu_{jk}u^\sigma_\mu = \bar\Gamma^\mu_{kj}u^\sigma_\mu$$
$$\bar R^\beta_{jkl}u^\sigma_\beta = 0 \tag{3.79}$$

The first equation of (3.79) is satisfied from the symmetry of the $\bar\Gamma^i_{jk}$, and the second is satisfied if $\bar R^i_{jkl} = 0$. Hence a necessary and sufficient condition that a Riemannian space be Euclidean is that the curvature tensor vanish.

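3.13. Parallel Displacement of a Vector. Consider an absolute contravariant vector $A^i$ in a Euclidean space. Using cartesian coordinates yields $\Gamma^i_{jk}(x) = 0$. We assume further that the $A^i$ are constant. From

$$\bar A^i = A^\alpha\frac{\partial \bar x^i}{\partial x^\alpha}$$

we have

$$d\bar A^i = A^\alpha\frac{\partial^2 \bar x^i}{\partial x^\beta\,\partial x^\alpha}\,dx^\beta = \bar A^\mu\frac{\partial x^\alpha}{\partial \bar x^\mu}\frac{\partial^2 \bar x^i}{\partial x^\beta\,\partial x^\alpha}\frac{\partial x^\beta}{\partial \bar x^\nu}\,d\bar x^\nu$$

Moreover

$$\bar\Gamma^i_{\mu\nu} = \frac{\partial^2 x^\sigma}{\partial \bar x^\mu\,\partial \bar x^\nu}\frac{\partial \bar x^i}{\partial x^\sigma} \qquad \text{since } \Gamma^i_{jk}(x) = 0$$
$$= -\frac{\partial^2 \bar x^i}{\partial x^\alpha\,\partial x^\beta}\frac{\partial x^\alpha}{\partial \bar x^\mu}\frac{\partial x^\beta}{\partial \bar x^\nu}$$

from Example 1.4. Hence

$$d\bar A^i = -\bar A^\mu\,\bar\Gamma^i_{\mu\nu}\,d\bar x^\nu \tag{3.80}$$

Now the $A^i$, being constant in a Euclidean space, can be looked upon as yielding a uniform or parallel vector field. Equation (3.80) describes how the components of this parallel vector field change in various coordinate systems. Since, generally, a Riemannian space is not Euclidean, we can use (3.80) to define parallelism of a vector field.

We say that $A^i$ is parallelly displaced with respect to the Riemannian $V_n$ along the curve $x^i = x^i(s)$ if

$$\frac{dA^i}{ds} = -A^\alpha\Gamma^i_{\alpha\beta}\frac{dx^\beta}{ds} \tag{3.81}$$

or $\dfrac{\delta A^i}{\delta s} = 0$.

We say that the vector suffers a parallel displacement along the curve. If a vector suffers a parallel displacement along all curves, then from $\dfrac{dA^i}{ds} = \dfrac{\partial A^i}{\partial x^\beta}\dfrac{dx^\beta}{ds}$ it follows that

$$\frac{\partial A^i}{\partial x^\beta} + A^\alpha\Gamma^i_{\alpha\beta} = 0$$

or $A^i_{,\beta} = 0$.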


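Example 3.17. Let us consider two unit vectors $A^i$, $B^i$ which undergo parallel displacement along a curve. We have

$$\cos\theta = g_{\alpha\beta}A^\alpha B^\beta$$

and

$$\frac{\delta(\cos\theta)}{\delta s} = g_{\alpha\beta}\frac{\delta A^\alpha}{\delta s}B^\beta + g_{\alpha\beta}A^\alpha\frac{\delta B^\beta}{\delta s} = 0$$

since $g_{\alpha\beta,\gamma} = 0$, $\dfrac{\delta A^\alpha}{\delta s} = 0$, $\dfrac{\delta B^\beta}{\delta s} = 0$. Hence, if two vectors of constant magnitude undergo parallel displacements along a curve, they are inclined at a constant angle.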
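The conclusion of Example 3.17 can be observed numerically. The sketch below, which is an illustration and not part of the original text, integrates the parallel-displacement equations (3.81) around a circle of constant colatitude on the unit sphere with a classical fourth-order Runge-Kutta step. The Christoffel symbols of the sphere, the colatitude, the starting vectors, and the step count are all assumptions chosen for the demonstration; the parameter along the curve is the angle $\varphi$.

```python
import math

THETA0 = 1.0  # colatitude of the circle (assumed sample value)

def rhs(A):
    """Right side of (3.81) along theta = THETA0, phi = s on the unit sphere.

    Uses Gamma^theta_{phi phi} = -sin cos and Gamma^phi_{theta phi} = cot.
    """
    a_th, a_ph = A
    return [math.sin(THETA0) * math.cos(THETA0) * a_ph,
            -math.cos(THETA0) / math.sin(THETA0) * a_th]

def rk4_step(A, h):
    """One classical Runge-Kutta step for dA/ds = rhs(A)."""
    k1 = rhs(A)
    k2 = rhs([A[i] + 0.5 * h * k1[i] for i in range(2)])
    k3 = rhs([A[i] + 0.5 * h * k2[i] for i in range(2)])
    k4 = rhs([A[i] + h * k3[i] for i in range(2)])
    return [A[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
            for i in range(2)]

def inner(A, B):
    """g_{alpha beta} A^alpha B^beta with g = diag(1, sin^2 THETA0)."""
    return A[0] * B[0] + math.sin(THETA0) ** 2 * A[1] * B[1]

A = [1.0, 0.0]                           # unit vector along theta
B = [0.6, 0.8 / math.sin(THETA0)]        # unit vector: 0.36 + 0.64 = 1
cos_start = inner(A, B)
steps = 2000
h = 2 * math.pi / steps                  # one full circuit in phi
for _ in range(steps):
    A = rk4_step(A, h)
    B = rk4_step(B, h)
cos_end = inner(A, B)
```

The lengths of both vectors and the value of $g_{\alpha\beta}A^\alpha B^\beta$ are unchanged after the circuit, up to integration error, although the components themselves generally do not return to their starting values.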


Two vectors at a point are said to be parallel if their corresponding components are proportional. If $A^i$ is a vector of constant magnitude, the vector $B^i = \varphi A^i$, $\varphi$ = scalar, is parallel to $A^i$. If $A^i$ is also parallel with respect to the $V_n$ along the curve $x^i = x^i(s)$, we have $\dfrac{\delta A^i}{\delta s} = 0$. Now

$$\frac{\delta B^i}{\delta s} = \varphi\frac{\delta A^i}{\delta s} + A^i\frac{d\varphi}{ds} = A^i\frac{d\varphi}{ds} = \frac{1}{\varphi}\frac{d\varphi}{ds}B^i$$

We desire $B^i$ to be parallel with respect to the $V_n$ along the curve. Hence a vector of variable magnitude must satisfy an equation of the type

$$\frac{\delta B^i}{\delta s} = f(s)B^i \tag{3.82}$$

if it is to be parallelly displaced along the curve.


Example 3.18. The curvature tensor arises under the following considerations: Let $x^i = x^i(t)$, $0 \leq t \leq 1$, be an infinitesimal closed path. The change in the components of a contravariant vector on being parallelly displaced along this closed path is

$$\Delta A^i = -\oint \Gamma^i_{\alpha\beta}A^\alpha\,dx^\beta$$

If we expand $A^\alpha$, $\Gamma^i_{\alpha\beta}$ in a Taylor series about $x^i_0 = x^i(0)$ and neglect infinitesimals of higher order, it can be shown that

$$\Delta A^i = -\tfrac14 (R^i_{\alpha\beta\gamma})_0 (A^\alpha)_0 \oint \left(x^\gamma\,dx^\beta - x^\beta\,dx^\gamma\right) \tag{3.83}$$

where $R^i_{\alpha\beta\gamma}$ is the curvature tensor of Secs. 3.11, 3.12.




Problems 


1. Show that the Riemannian space for which $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$ is not Euclidean.
2. Derive (3.79).
3. Show that the unit tangent vector to a geodesic suffers a parallel displacement along the geodesic.
4. If $B^i$ satisfies (3.82), show by letting $B^i = \psi A^i$ that it is possible to find $\psi$ so that $\dfrac{\delta A^i}{\delta s} = 0$.
5. Derive (3.83).


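3.14. Lagrange's Equations of Motion. Let L be an absolute scalar invariant of the space coordinates $(x^1, x^2, \ldots, x^n)$, their time derivatives $(\dot x^1, \dot x^2, \ldots, \dot x^n)$, and the invariant time t, $t = \bar t$. Hence

$$\bar L(\bar x^1, \ldots, \bar x^n, \dot{\bar x}^1, \ldots, \dot{\bar x}^n, \bar t) = L(x^1, \ldots, x^n, \dot x^1, \ldots, \dot x^n, t)$$

under the transformations

$$x^i = x^i(\bar x^1, \bar x^2, \ldots, \bar x^n) \qquad i = 1, 2, \ldots, n \tag{3.84}$$

From (3.84)

$$\dot x^i = \frac{\partial x^i}{\partial \bar x^\alpha}\dot{\bar x}^\alpha = F^i(\bar x, \dot{\bar x}) \qquad \frac{\partial \dot x^i}{\partial \dot{\bar x}^j} = \frac{\partial x^i}{\partial \bar x^j} \qquad \frac{\partial \dot x^i}{\partial \bar x^j} = \frac{\partial^2 x^i}{\partial \bar x^j\,\partial \bar x^\alpha}\dot{\bar x}^\alpha$$

when it is assumed that the $\bar x^i$ and $\dot{\bar x}^i$ are independent variables as far as $\bar L$ is concerned. Now

$$\frac{\partial \bar L}{\partial \bar x^i} = \frac{\partial L}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial \bar x^i} + \frac{\partial L}{\partial \dot x^\alpha}\frac{\partial \dot x^\alpha}{\partial \bar x^i} = \frac{\partial L}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial \bar x^i} + \frac{\partial L}{\partial \dot x^\alpha}\frac{\partial^2 x^\alpha}{\partial \bar x^i\,\partial \bar x^\beta}\dot{\bar x}^\beta \tag{3.85}$$

Also

$$\frac{\partial \bar L}{\partial \dot{\bar x}^i} = \frac{\partial L}{\partial \dot x^\alpha}\frac{\partial \dot x^\alpha}{\partial \dot{\bar x}^i} = \frac{\partial L}{\partial \dot x^\alpha}\frac{\partial x^\alpha}{\partial \bar x^i}$$

so that

$$\frac{d}{dt}\left(\frac{\partial \bar L}{\partial \dot{\bar x}^i}\right) = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot x^\alpha}\right)\frac{\partial x^\alpha}{\partial \bar x^i} + \frac{\partial L}{\partial \dot x^\alpha}\frac{\partial^2 x^\alpha}{\partial \bar x^i\,\partial \bar x^\beta}\dot{\bar x}^\beta \tag{3.86}$$

Subtracting (3.85) from (3.86) yields

$$\frac{d}{dt}\left(\frac{\partial \bar L}{\partial \dot{\bar x}^i}\right) - \frac{\partial \bar L}{\partial \bar x^i} = \left[\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x^\alpha}\right) - \frac{\partial L}{\partial x^\alpha}\right]\frac{\partial x^\alpha}{\partial \bar x^i} \tag{3.87}$$

Equation (3.87) implies immediately that

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x^i}\right) - \frac{\partial L}{\partial x^i}$$

is an absolute covariant vector.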
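Let us consider a system of n particles, the mass $m_s$, s = 1, 2, . . . , n, located at the point $x^r_s$, r = 1, 2, 3. We assume that the coordinates are Euclidean and that Newtonian mechanics apply. Let

$$V(x^1_1, x^2_1, x^3_1, x^1_2, x^2_2, x^3_2, \ldots, x^1_n, x^2_n, x^3_n)$$

be the potential function such that

$$F^r_s = -\frac{\partial V}{\partial x^r_s}$$

represents the rth component of the force applied to particle s. In Newtonian mechanics $F^r_s = m_s\dfrac{d^2x^r_s}{dt^2}$ for Euclidean coordinates. The kinetic energy of the system is defined as

$$T = \sum_{s=1}^{n}\tfrac12 m_s\left(\frac{ds_s}{dt}\right)^2 = \sum_{s=1}^{n}\tfrac12 m_s\,g_{\alpha\beta}\,\dot x^\alpha_s\dot x^\beta_s$$

where $g_{\alpha\beta} = \delta_{\alpha\beta}$. The Lagrangian of the system is defined by

$$L = T - V = \sum_{s=1}^{n}\tfrac12 m_s\,g_{\alpha\beta}\,\dot x^\alpha_s\dot x^\beta_s - V$$

Thus

$$\frac{\partial L}{\partial \dot x^r_s} = m_s\,g_{\alpha r}\dot x^\alpha_s = m_s\dot x^r_s \qquad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot x^r_s}\right) = m_s\ddot x^r_s \qquad \frac{\partial L}{\partial x^r_s} = -\frac{\partial V}{\partial x^r_s}$$

so that

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x^r_s}\right) - \frac{\partial L}{\partial x^r_s} = m_s\ddot x^r_s + \frac{\partial V}{\partial x^r_s} = 0$$

for Newtonian mechanics. Since $\dfrac{d}{dt}\left(\dfrac{\partial L}{\partial \dot x^r}\right) - \dfrac{\partial L}{\partial x^r}$ is a vector, it vanishes in all coordinate systems. We replace the $x^i$ by any system of coordinates $(q^1, q^2, \ldots, q^n)$ which completely specifies the configuration of the system of particles. Lagrange's equations of motion are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q^r}\right) - \frac{\partial L}{\partial q^r} = 0 \qquad r = 1, 2, \ldots, n \tag{3.88}$$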
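Example 3.19. We consider the motion of a particle acted upon by the force field $\mathbf F = -\nabla V$. We use cylindrical coordinates.

$$L = T - V = \tfrac12 m(\dot r^2 + r^2\dot\theta^2 + \dot z^2) - V$$

so that

$$\frac{\partial L}{\partial r} = mr\dot\theta^2 - \frac{\partial V}{\partial r} \qquad \frac{\partial L}{\partial \dot r} = m\dot r \qquad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot r}\right) = m\ddot r$$

and one of Lagrange's equations of motion is

$$m\ddot r - mr\dot\theta^2 + \frac{\partial V}{\partial r} = 0$$

Since $-\dfrac{\partial V}{\partial r}$ represents the radial force, the term $\ddot r - r\dot\theta^2$ must be the radial acceleration.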


If no potential function exists, or if it is difficult to obtain the potential function, we can modify Lagrange's equations as follows: Since the kinetic energy is a scalar invariant, we have that

$$Q_r = \frac{d}{dt}\left(\frac{\partial T}{\partial \dot x^r}\right) - \frac{\partial T}{\partial x^r} \tag{3.89}$$

is a covariant vector. Let the reader show that if the $x^i$ are Euclidean coordinates, then $Q_r$ is the Newtonian force. If $f_r$ is the force in a y coordinate system, then

$$Q_r = f_\alpha\frac{\partial y^\alpha}{\partial x^r} \qquad Q_r\,dx^r = f_\alpha\frac{\partial y^\alpha}{\partial x^r}\,dx^r = f_\alpha\,dy^\alpha$$

so that $Q_r\,dx^r$ is a scalar invariant and represents the differential of work dW, and

$$Q_r = \frac{\partial W}{\partial x^r}$$

We obtain $Q_i$ by allowing $x^i$ to vary by an amount $\Delta x^i$, keeping $x^1, x^2, \ldots, x^{i-1}, x^{i+1}, \ldots, x^n$ fixed, calculate the work $\Delta W_i$ done by the forces acting on the system, and compute

$$Q_i = \lim_{\Delta x^i \to 0}\frac{\Delta W_i}{\Delta x^i} \qquad i \text{ not summed}$$
azo A 


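Example 3.20. A hoop rotates about a vertical diameter with constant angular speed $\omega$. A bead is free to slide on the wire hoop, and there is no friction between hoop and bead. Gravity acts on the bead of mass m. We set up the equations of motion of the bead, using spherical coordinates. The force of the hoop on the bead is given by $\mathbf F = R\,\mathbf e_r + \Phi\,\mathbf e_\varphi$, R, $\Phi$ unknown, and the force of gravity is

$$\mathbf G = -mg\cos\theta\,\mathbf e_r + mg\sin\theta\,\mathbf e_\theta$$

Hence

$$Q_r = R - mg\cos\theta \qquad Q_\theta = mgr\sin\theta \qquad Q_\varphi = \Phi r\sin\theta$$

From $T = \tfrac12 m(\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\varphi^2)$, we have

$$\frac{\partial T}{\partial r} = m(r\dot\theta^2 + r\sin^2\theta\,\dot\varphi^2) \qquad \frac{d}{dt}\left(\frac{\partial T}{\partial \dot r}\right) = m\ddot r$$
$$\frac{\partial T}{\partial \theta} = mr^2\sin\theta\cos\theta\,\dot\varphi^2 \qquad \frac{d}{dt}\left(\frac{\partial T}{\partial \dot\theta}\right) = \frac{d}{dt}(mr^2\dot\theta)$$
$$\frac{\partial T}{\partial \varphi} = 0 \qquad \frac{d}{dt}\left(\frac{\partial T}{\partial \dot\varphi}\right) = \frac{d}{dt}(mr^2\sin^2\theta\,\dot\varphi)$$

Equation (3.89) yields

$$\frac{d^2r}{dt^2} - (r\dot\theta^2 + r\sin^2\theta\,\dot\varphi^2) = \frac{R}{m} - g\cos\theta$$
$$\frac{d}{dt}(r^2\dot\theta) - r^2\sin\theta\cos\theta\,\dot\varphi^2 = gr\sin\theta \tag{3.90}$$
$$\frac{d}{dt}(r^2\sin^2\theta\,\dot\varphi) = \frac{\Phi}{m}\,r\sin\theta$$

The geometry of the configuration yields r = constant = $r_0$, $\dot r = \ddot r = 0$, $\dot\varphi = \omega$, $\ddot\varphi = 0$. The solution of the problem consists in finding $\theta(t)$ by integrating the second equation of (3.90). Once this is done, R and $\Phi$ can be obtained from the other two equations. Let the reader show that

$$\tfrac12 r_0^2\left(\frac{d\theta}{dt}\right)^2 - \tfrac12 r_0^2\omega^2\sin^2\theta = -gr_0\cos\theta + \text{constant} \tag{3.91}$$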
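The first integral (3.91) can be checked against a direct numerical integration of the second equation of (3.90). The sketch below is illustrative only: with $r = r_0$ and $\dot\varphi = \omega$ that equation reduces to $\ddot\theta = \omega^2\sin\theta\cos\theta + (g/r_0)\sin\theta$, which is integrated here with a classical Runge-Kutta step. The numerical values of g, $r_0$, $\omega$, the initial state, and the step size are assumptions.

```python
import math

G_ACC, R0, OMEGA = 9.81, 0.5, 3.0   # assumed sample values (SI units)

def theta_ddot(theta):
    """Second equation of (3.90) with r = R0 and phi_dot = OMEGA."""
    return (OMEGA ** 2 * math.sin(theta) * math.cos(theta)
            + (G_ACC / R0) * math.sin(theta))

def first_integral(theta, theta_dot):
    """Quantity that (3.91) asserts is constant along the motion."""
    return (0.5 * R0 ** 2 * theta_dot ** 2
            - 0.5 * R0 ** 2 * OMEGA ** 2 * math.sin(theta) ** 2
            + G_ACC * R0 * math.cos(theta))

def rk4(theta, theta_dot, dt):
    """One classical Runge-Kutta step for the state (theta, theta_dot)."""
    def f(state):
        return (state[1], theta_ddot(state[0]))
    s = (theta, theta_dot)
    k1 = f(s)
    k2 = f((s[0] + 0.5 * dt * k1[0], s[1] + 0.5 * dt * k1[1]))
    k3 = f((s[0] + 0.5 * dt * k2[0], s[1] + 0.5 * dt * k2[1]))
    k4 = f((s[0] + dt * k3[0], s[1] + dt * k3[1]))
    return (s[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            s[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

theta, theta_dot = 2.5, 0.0          # released at rest near the lowest point
c_start = first_integral(theta, theta_dot)
for _ in range(2000):                # 2 seconds with dt = 1e-3
    theta, theta_dot = rk4(theta, theta_dot, 1e-3)
c_end = first_integral(theta, theta_dot)
```

Here $\theta$ is measured from the upward vertical, consistent with the gravity components of the example, so $\theta = \pi$ is the lowest point of the hoop.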
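3.15. Hamilton's Equations of Motion. From the Lagrangian $L(x, \dot x, t)$ we define

$$p_i = \frac{\partial L}{\partial \dot x^i} \qquad i = 1, 2, \ldots, n \tag{3.92}$$

We show that $p_i$ is a covariant vector, called the generalized momentum vector. Since $\bar L = L$, we have

$$\bar p_i = \frac{\partial \bar L}{\partial \dot{\bar x}^i} = \frac{\partial L}{\partial \dot x^\alpha}\frac{\partial \dot x^\alpha}{\partial \dot{\bar x}^i} = p_\alpha\frac{\partial x^\alpha}{\partial \bar x^i}$$

which proves our statement. We shall now use $q^i$ instead of $x^i$ to represent the coordinate system. The Hamiltonian is defined by

$$H = p_\alpha\dot q^\alpha - L(q, \dot q, t) \tag{3.93}$$

From $p_i = \dfrac{\partial L}{\partial \dot q^i} = F_i(q, \dot q, t)$ we assume that we can solve for the $\dot q^i$, i = 1, 2, . . . , n, in terms of the $p_i$, $q^i$, and time. Thus the Hamiltonian H now becomes a function of the p's, q's, and time,

$$H = H(p, q, t)$$

Hence

$$\frac{\partial H}{\partial q^i} = p_\alpha\frac{\partial \dot q^\alpha}{\partial q^i} - \frac{\partial L}{\partial q^i} - \frac{\partial L}{\partial \dot q^\alpha}\frac{\partial \dot q^\alpha}{\partial q^i} = -\frac{\partial L}{\partial q^i}$$

since $\dfrac{\partial L}{\partial \dot q^\alpha} = p_\alpha$. From Lagrange's equations of motion,

$$\frac{\partial L}{\partial q^i} = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q^i}\right) = \frac{dp_i}{dt}$$

so that

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q^i}$$

Also

$$\frac{\partial H}{\partial p_i} = \dot q^i + p_\alpha\frac{\partial \dot q^\alpha}{\partial p_i} - \frac{\partial L}{\partial \dot q^\alpha}\frac{\partial \dot q^\alpha}{\partial p_i} = \dot q^i$$

from (3.92) and (3.93).

Hamilton's equations of motion are

$$\frac{dq^i}{dt} = \frac{\partial H}{\partial p_i} \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q^i} \tag{3.94}$$

Whereas Lagrange's equations of motion are, in general, a system of n second-order differential equations, Hamilton's equations are a system of 2n first-order differential equations.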


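Example 3.21. Referring to Example 3.19,

$$p_r = \frac{\partial L}{\partial \dot r} = m\dot r \qquad \dot r = \frac{p_r}{m}$$
$$p_\theta = \frac{\partial L}{\partial \dot\theta} = m r^2\dot\theta \qquad \dot\theta = \frac{p_\theta}{m r^2}$$
$$p_z = \frac{\partial L}{\partial \dot z} = m\dot z \qquad \dot z = \frac{p_z}{m}$$
and
$$H = p_\alpha\dot q^\alpha - L = \frac{p_r^2}{m} + \frac{p_\theta^2}{m r^2} + \frac{p_z^2}{m} - \frac{m}{2}\left(\frac{p_r^2}{m^2} + \frac{p_\theta^2}{m^2 r^2} + \frac{p_z^2}{m^2}\right) + V(r, \theta, z, t)$$
$$= \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2} + p_z^2\right) + V(r, \theta, z, t)$$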


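Hamilton’s equations are

$$\frac{dr}{dt} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m} \qquad \frac{dp_r}{dt} = -\frac{\partial H}{\partial r} = \frac{p_\theta^2}{m r^3} - \frac{\partial V}{\partial r}$$
$$\frac{d\theta}{dt} = \frac{p_\theta}{m r^2} \qquad \frac{dp_\theta}{dt} = -\frac{\partial V}{\partial \theta}$$
$$\frac{dz}{dt} = \frac{p_z}{m} \qquad \frac{dp_z}{dt} = -\frac{\partial V}{\partial z}$$
Hence
$$m\frac{d^2 r}{dt^2} = \frac{p_\theta^2}{m r^3} - \frac{\partial V}{\partial r} = m r\left(\frac{d\theta}{dt}\right)^2 - \frac{\partial V}{\partial r}$$
$$\frac{m}{r}\frac{d}{dt}\left(r^2\frac{d\theta}{dt}\right) = -\frac{1}{r}\frac{\partial V}{\partial \theta}$$
$$m\frac{d^2 z}{dt^2} = -\frac{\partial V}{\partial z}$$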


These are Newton’s equations of motion for a particle using cylindrical coordinates. 
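
The cylindrical-coordinate form of Hamilton’s equations can be checked numerically against a case whose Cartesian solution is known. The sketch below assumes the potential $V = \tfrac{1}{2}k r^2$ (an assumption made only for this illustration, as are the values of `m` and `k`, the Runge-Kutta integrator, and every function name) and integrates the six first-order equations.

```python
import math

def hamilton_rhs(state, m=1.0, k=1.0):
    """Right-hand sides of Hamilton's equations in cylindrical coordinates,
    for the assumed potential V = k r^2 / 2 (independent of theta and z)."""
    r, theta, z, p_r, p_theta, p_z = state
    dr = p_r / m
    dtheta = p_theta / (m * r ** 2)
    dz = p_z / m
    dp_r = p_theta ** 2 / (m * r ** 3) - k * r   # -dH/dr
    dp_theta = 0.0                               # -dV/dtheta = 0
    dp_z = 0.0                                   # -dV/dz = 0
    return (dr, dtheta, dz, dp_r, dp_theta, dp_z)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)))
    k4 = f(tuple(s + dt * d for s, d in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(state, t_end, dt=1e-3):
    for _ in range(round(t_end / dt)):
        state = rk4_step(hamilton_rhs, state, dt)
    return state

# r = 1, theta = 0, p_r = 0, p_theta = 0.5: in Cartesian form x = cos t, y = 0.5 sin t.
final_state = integrate((1.0, 0.0, 0.0, 0.0, 0.5, 0.0), 2.0)
```

For this potential the motion is a two-dimensional harmonic oscillator, so the computed $r$ at $t = 2$ can be compared with $(\cos^2 2 + \tfrac{1}{4}\sin^2 2)^{1/2}$.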


Problems 


1. Find the components of the acceleration vector in spherical coordinates and in 
cylindrical coordinates for a particle. 

2. A particle slides in a frictionless tube which rotates in a horizontal plane with 
constant angular speed w. Neglecting gravity, find the reaction between the tube 
and the particle. 


3. If $T = a_{\alpha\beta}(q^1, q^2, \ldots, q^n)\,\dot q^\alpha\dot q^\beta$, show that $2T = \dfrac{\partial T}{\partial \dot q^\alpha}\,\dot q^\alpha$.

4. Set up Hamilton’s equations for a particle moving in one dimension under a
Hooke’s law force, $F = -k^2 x$.

5. Integrate (3.91) under the assumption $\theta = 180°$.



3.16. Euler’s Equations of Motion. We discuss the motion of a rigid
body with one fixed point. The reader is urged to read Example
2.10, the results of which we shall use. The $x$ coordinate system
will be fixed in space, and the $\bar x$ coordinate system will be rigidly attached
to the moving body. We shall use the $\sum$ sign to represent an integration
over the complete rigid body.

In vector notation the angular-momentum vector is defined to be

$$\mathbf{H} = \sum \mathbf{r} \times m\mathbf{v}$$
and in tensor notation
$$H_k = \sum m\,\epsilon_{ijk}\,x^i\dot x^j \qquad (3.95)$$
From (3.95)
$$\frac{dH_k}{dt} = \sum m\,\epsilon_{ijk}\,\dot x^i\dot x^j + \sum m\,\epsilon_{ijk}\,x^i\ddot x^j = \sum m\,\epsilon_{ijk}\,x^i\ddot x^j \qquad (3.96)$$

since $\epsilon_{ijk}\dot x^i\dot x^j = \epsilon_{jik}\dot x^j\dot x^i = -\epsilon_{ijk}\dot x^j\dot x^i = -\epsilon_{ijk}\dot x^i\dot x^j = 0$.


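The moment or torque vector is defined as

$$\mathbf{L} = \sum \mathbf{r} \times \mathbf{f} = \sum \mathbf{r} \times m\frac{d^2\mathbf{r}}{dt^2}$$
$$L_k = \sum m\,\epsilon_{ijk}\,x^i\ddot x^j \qquad (3.97)$$
Hence
$$\frac{dH_k}{dt} = L_k \qquad (3.98)$$
Now
$$H_k = \sum m\,\epsilon_{ijk}\,x^i\omega^j_\alpha x^\alpha$$

since $\dot x^j = \omega^j_\alpha x^\alpha$ [see (2.21)]. Since $\epsilon_{ijk}$ and $\omega^j_\alpha(t)$ are independent of the
space coordinates, we write

$$H_k = \epsilon_{ijk}\,\omega^j_\alpha \sum m\,x^i x^\alpha$$
We define $I^{i\alpha} = \sum m\,x^i x^\alpha$ as the inertia tensor. Since $x^i = a^i_\alpha\bar x^\alpha$, we have

$$I^{ij} = a^i_\alpha a^j_\beta \sum m\,\bar x^\alpha\bar x^\beta = a^i_\alpha a^j_\beta\,\bar I^{\alpha\beta} = \frac{\partial x^i}{\partial \bar x^\alpha}\frac{\partial x^j}{\partial \bar x^\beta}\,\bar I^{\alpha\beta}$$

The components of the inertia tensor relative to the moving frame
are constants since the frame is rigidly attached to the rigid body.
This is not the case for the $I^{ij}$ since the $a^i_\alpha$ are, in general, functions of
time. Remember that the $a^i_\alpha$ are the direction cosines, which vary with
time. Thus it will be useful to deal with the $\bar x$ coordinate system. In
this frame

$$\bar H_k = \bar\epsilon_{ijk}\,\bar\omega^j_\alpha\,\bar I^{i\alpha} \qquad (3.99)$$

The $\bar\epsilon_{ijk}$ transform like the components of an absolute tensor for orthogonal
transformations since $\left|\partial x/\partial \bar x\right| = |a^i_j| = 1$ [see (1.25)]. Moreover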

















— Ox z 
HA, = om = AtH, 
dH, a oat 
a Ag—*-+—- 
= y” 
= [, it 
. , aa} 
From AYat, = 53, wk = — A} “i it follows that 
a” = AtASa' wt 


so that 





(AgH.)(Afaioy) = Haj 


Hence [using (3.99)]| 


End” ot ant Eqid Maj at — L; (3.100) 
Equation (3.100) is one form of Euler’s equations of motion. Since
$I^{ij}$ is a quadratic form, we can find an orthogonal transformation which
diagonalizes this tensor or matrix (see Sec. 1.5). Let us now use this
new coordinate system (it is also fixed in the moving body). We omit
the bars of (3.100) for convenience. We have

$$\|I^{ij}\| = \begin{Vmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{Vmatrix}$$

It is easy to see that $I_2 + I_3 = A_x$, the moment of inertia about the
$x$ axis, etc. The $xyz$ coordinate system is fixed in the moving body.
Equation (3.100) becomes, for $k = 1$,


dw} 
él"! ar — €gyl“wlot = Ly 
dw dw? 
2 3 3: 
€o3:1 7? dt + €30:/ 38 a €r301 wsw? — esol Bwhw? — e€:o3] w?w3 


— €2131 wwe = Ly 


3 
(7? + 1%) ‘ -+- [(7?? + P) nics ( [38 + TD! )ahw? — Ly 


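or
$$A_x\frac{d\omega_x}{dt} + (A_z - A_y)\,\omega_y\omega_z = L_x$$
Similarly
$$A_y\frac{d\omega_y}{dt} + (A_x - A_z)\,\omega_z\omega_x = L_y \qquad (3.101)$$
$$A_z\frac{d\omega_z}{dt} + (A_y - A_x)\,\omega_x\omega_y = L_z$$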
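
Euler’s equations in the principal-axis form $A_x\,d\omega_x/dt + (A_z - A_y)\,\omega_y\omega_z = L_x$ (and cyclically) are a first-order system for the angular-velocity components and are easy to integrate numerically. The following is a minimal sketch under stated assumptions: the principal moments `A`, the initial angular velocity, the fourth-order Runge-Kutta scheme, and all function names are hypothetical choices for illustration. For a free body ($L = 0$) the two quadratic expressions computed at the end should remain constant along the solution.

```python
def euler_rhs(w, A, L=(0.0, 0.0, 0.0)):
    """d(omega)/dt from Euler's equations in principal axes."""
    wx, wy, wz = w
    Ax, Ay, Az = A
    return (
        (L[0] - (Az - Ay) * wy * wz) / Ax,
        (L[1] - (Ax - Az) * wz * wx) / Ay,
        (L[2] - (Ay - Ax) * wx * wy) / Az,
    )

def rk4_step(w, A, dt):
    """One classical fourth-order Runge-Kutta step for the free body."""
    k1 = euler_rhs(w, A)
    k2 = euler_rhs(tuple(x + 0.5 * dt * k for x, k in zip(w, k1)), A)
    k3 = euler_rhs(tuple(x + 0.5 * dt * k for x, k in zip(w, k2)), A)
    k4 = euler_rhs(tuple(x + dt * k for x, k in zip(w, k3)), A)
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(w, k1, k2, k3, k4)
    )

def integrate(w, A, dt, steps):
    for _ in range(steps):
        w = rk4_step(w, A, dt)
    return w

def twice_kinetic_energy(w, A):
    return sum(a * x * x for a, x in zip(A, w))

def angular_momentum_squared(w, A):
    return sum((a * x) ** 2 for a, x in zip(A, w))

A = (1.0, 2.0, 3.0)          # hypothetical principal moments of inertia
w0 = (1.0, 0.1, 0.5)         # hypothetical initial angular velocity
w1 = integrate(w0, A, 1e-3, 5000)
```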


Problems

1. For a free body, $L = 0$, show that

$$A_x\omega_x^2 + A_y\omega_y^2 + A_z\omega_z^2 = \text{constant}$$
and
$$A_x^2\omega_x^2 + A_y^2\omega_y^2 + A_z^2\omega_z^2 = \text{constant}$$

follow from (3.101).

2. Assume that, at $t = 0$ for a free body, $\omega_x = \omega_0$, $\omega_y = \omega_z = 0$. Describe the
motion for $t > 0$.

3. Solve for $\omega_x$, $\omega_y$, $\omega_z$ for the free body with $A_x = A_y$.

4. Show that the body of Prob. 3 precesses with constant angular speed about the
angular-momentum vector.

5. Derive the second and third equations of (3.101).


3.17. The Navier-Stokes Equations of Motion for a Viscous Fluid.
Let us first consider the motion of a fluid in the neighborhood of a point
$P(x)$ of the fluid. Let the velocity of the fluid at $P$ be given by $u^i$, or
$u_i = g_{i\alpha}u^\alpha$. The velocity at a nearby point $Q(x + dx)$ is, except for
infinitesimals of higher order,

$$u_i(x + dx) = u_i(x) + du_i = u_i(x) + \frac{\partial u_i}{\partial x^\alpha}\,dx^\alpha$$
$$= u_i(x) + \frac{1}{2}\left(\frac{\partial u_i}{\partial x^\alpha} - \frac{\partial u_\alpha}{\partial x^i}\right)dx^\alpha + \frac{1}{2}\left(\frac{\partial u_i}{\partial x^\alpha} + \frac{\partial u_\alpha}{\partial x^i}\right)dx^\alpha$$

The partial derivatives are evaluated at the point $P$. Strictly speaking,
$du_i$ is not a vector, so that we should be concerned with the intrinsic
differential $\delta u_i$. The above can be written

$$u_i(x + dx) = u_i(x) + \tfrac{1}{2}(u_{i,\alpha} - u_{\alpha,i})\,dx^\alpha + \tfrac{1}{2}(u_{i,\alpha} + u_{\alpha,i})\,dx^\alpha \qquad (3.102)$$


We now analyze (3.102), which states that the velocity of Q is the sum of 
three terms: 

1. $u_i(x)$, the velocity of $P$. $Q$ is carried along with $P$, so that $u_i(x)$
is a translational effect.

2. The term $\tfrac{1}{2}(u_{i,\alpha} - u_{\alpha,i})\,dx^\alpha$ corresponds to a rotation with angular
velocity $\boldsymbol{\omega} = \tfrac{1}{2}\nabla \times \mathbf{u}$. A sphere in the neighborhood of $P$ with center
at $P$ would be translated and rotated under the action of terms 1 and 2.
It follows that terms 1 and 2 are rigid-body motions.

3. Hence the term $\tfrac{1}{2}(u_{i,\alpha} + u_{\alpha,i})\,dx^\alpha$ must be responsible if any deformations
of the fluid take place.

We define

$$s_{ij} = \tfrac{1}{2}(u_{i,j} + u_{j,i}) \qquad (3.103)$$

as the strain tensor. Let $dx^i = x^i$ temporarily, that is, let $P$ be the
origin of our coordinate system, $x^i$ the coordinates of the nearby point $Q$.
We can write $s_{ij}x^j = \tfrac{1}{2}\nabla(s_{ij}x^i x^j)$ using Euclidean coordinates. However,
$s_{ij}x^i x^j$ is a quadratic form which can be reduced to

$$\varphi = \bar s_{11}(\bar x^1)^2 + \bar s_{22}(\bar x^2)^2 + \bar s_{33}(\bar x^3)^2$$

under an orthogonal coordinate transformation. In the $\bar x$ system,
$\tfrac{1}{2}\nabla\varphi = \bar s_{11}\bar x^1\mathbf{i} + \bar s_{22}\bar x^2\mathbf{j} + \bar s_{33}\bar x^3\mathbf{k}$. Thus, along with the rigid-body motions
of terms 1 and 2, occurs a velocity whose components in the $\bar x^1$, $\bar x^2$, $\bar x^3$
directions are proportional to $\bar x^1$, $\bar x^2$, $\bar x^3$, respectively. The sphere surrounding
$P$ tends to grow into an ellipsoid whose principal axes are the
$\bar x^1$, $\bar x^2$, $\bar x^3$ axes.

Terms 1, 2, and 3 characterize the motion of a fluid completely. In
order to discuss the dynamics of a fluid, one must consider the stress
tensor $t^{ij}$. We refer to Fig. 2.22. The face $ABCD$ is in contact with a


part of the fluid. This part exerts a force on the face. This force per
unit area has components $t^{xy}$, $t^{yy}$, $t^{zy}$. The $y$ refers to the fact that the
normal to the face $ABCD$ points in the $y$ direction. By considering
the other two principal faces one is led to the stress tensor $t^{ij}$, $i, j = 1, 2, 3$.
The total force on a closed surface $S$ is given by

$$\iint_S t^{ij}N_j\,d\sigma \qquad (3.104)$$

where $N_j$ is the unit normal vector to the surface area $d\sigma$. Let the reader
show that (3.104) can be written as

$$\iiint_R t^{ij}_{,j}\,d\tau \qquad (3.105)$$

so that $t^{ij}_{,j}$ represents the force per unit volume due to the stress tensor
$t^{ij}$. The equality of (3.104) and (3.105) is essentially the divergence
theorem of vector analysis.

In order to obtain the Navier-Stokes equations of motion, we make the
fundamental assumption that the components of the stress tensor be
proportional to the components of the strain tensor. Thus

$$t^i_j = a^{i\beta}_{j\alpha}\,s^\alpha_\beta \qquad (3.106)$$

We further assume that $a^{i\beta}_{j\alpha}$ is an isotropic tensor so that
$$a^{i\beta}_{j\alpha} = k(x)\,\delta^i_j\delta^\beta_\alpha + l(x)\,\delta^i_\alpha\delta^\beta_j \qquad (3.107)$$

(see Example 3.2).
Combining (3.106) and (3.107) yields

$$t^i_j = k(x)\,\delta^i_j\delta^\beta_\alpha s^\alpha_\beta + l(x)\,\delta^i_\alpha\delta^\beta_j s^\alpha_\beta = k(x)\,\delta^i_j s^\alpha_\alpha + l(x)\,s^i_j \qquad (3.108)$$
and
$$t^i_i = 3k s^\alpha_\alpha + l s^\alpha_\alpha = (3k + l)\,s^\alpha_\alpha$$
In hydrostatics
$$\|t^i_j\| = \begin{Vmatrix} -p & 0 & 0 \\ 0 & -p & 0 \\ 0 & 0 & -p \end{Vmatrix}$$

so that $t^i_i = -3p$. The pressure $p$ is defined to be $p = -\tfrac{1}{3}t^i_i$ for the
general case. Thus

$$-3p = (3k + l)\,s^\alpha_\alpha \qquad k s^\alpha_\alpha = -p - \frac{l}{3}s^\alpha_\alpha$$


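Equation (3.108) becomes

$$t^i_j = -p\,\delta^i_j - \frac{l}{3}\,\delta^i_j s^\alpha_\alpha + l s^i_j$$
and
$$t^j_{i,j} = -p_{,i} - \frac{l}{3}\,s^\alpha_{\alpha,i} + l s^j_{i,j}$$
or
$$t^j_{i,j} = -p_{,i} - \frac{l}{3}\,g^{\alpha\beta}s_{\alpha\beta,i} + l g^{j\alpha}s_{i\alpha,j} \qquad (3.109)$$

From $s_{ij} = \tfrac{1}{2}(u_{i,j} + u_{j,i})$ and $l = 2\nu$, (3.109) becomes
$$t^j_{i,j} = -p_{,i} + \frac{\nu}{3}\,g^{\alpha\beta}u_{\alpha,\beta i} + \nu g^{\alpha\beta}u_{i,\alpha\beta} \qquad (3.110)$$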
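In vector form

$$\mathbf{F} = -\nabla p + \frac{\nu}{3}\nabla(\nabla\cdot\mathbf{u}) + \nu\nabla^2\mathbf{u} \qquad (3.111)$$

where $\mathbf{F}$ is the force per unit volume due to the stress tensor $t^{ij}$. Let $\mathbf{f}$
be the external force per unit mass so that $(\mathbf{F} + \rho\mathbf{f})\,d\tau$ is the total force
acting on the element $d\tau$. Newton’s second law of motion states that

$$\frac{d}{dt}(\rho\,d\tau\,\mathbf{u}) = (\mathbf{F} + \rho\mathbf{f})\,d\tau$$
or
$$\rho\,d\tau\,\frac{d\mathbf{u}}{dt} = (\mathbf{F} + \rho\mathbf{f})\,d\tau$$

since $\rho\,d\tau$ is a constant during the motion. Hence

$$\rho\frac{d\mathbf{u}}{dt} = \rho\mathbf{f} - \nabla p + \frac{\nu}{3}\nabla(\nabla\cdot\mathbf{u}) + \nu\nabla^2\mathbf{u} \qquad (3.112)$$

where $\nu$ is the viscosity of the fluid. Equation (3.112) is the Navier-
Stokes equation of motion for a viscous fluid.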
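
A simple special case gives a quick numerical sanity check of (3.112). For steady axial flow $\mathbf{u} = u(r)\mathbf{k}$ of an incompressible fluid in a tube of radius $a$ with no external force, the equation reduces to $\nu\nabla^2 u = dp/dz$ with $\nabla^2 u = u'' + u'/r$. The sketch below assumes the parabolic profile $u = (1/4\nu)(dp/dz)(r^2 - a^2)$ and verifies it by finite differences; the function names, the step `h`, and the sample values are assumptions made for this illustration.

```python
def poiseuille_velocity(r, a, nu, dp_dz):
    """Parabolic profile u(r) vanishing on the wall r = a."""
    return dp_dz * (r * r - a * a) / (4.0 * nu)

def axisymmetric_laplacian(u, r, h=1e-3):
    """Central-difference estimate of u'' + u'/r for a function u(r), r > 0."""
    d1 = (u(r + h) - u(r - h)) / (2.0 * h)
    d2 = (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h)
    return d2 + d1 / r

# Hypothetical tube radius, viscosity, and constant pressure gradient.
a, nu, dp_dz = 1.0, 0.5, -2.0

def u(r):
    return poiseuille_velocity(r, a, nu, dp_dz)

# nu * laplacian(u) - dp/dz should vanish at every interior radius.
residuals = [nu * axisymmetric_laplacian(u, r) - dp_dz for r in (0.2, 0.5, 0.9)]
```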


Problems

1. For an incompressible fluid show that

$$\rho\frac{d\mathbf{u}}{dt} = \rho\mathbf{f} - \nabla p + \nu\nabla^2\mathbf{u}$$

2. Consider the steady state of flow of an incompressible fluid through a cylindrical
tube of radius $a$. Let $\mathbf{u} = u(r)\mathbf{k}$, $r^2 = x^2 + y^2$. Show that $p = p(z)$, $\nu\nabla^2 u = \dfrac{dp}{dz}$.
Then show that $u = \dfrac{1}{4\nu}\dfrac{dp}{dz}(r^2 - a^2)$, $\dfrac{dp}{dz} = \text{constant}$.

3. Solve for the steady-state motion of an incompressible viscous fluid between two
parallel plates (infinite in extent), one of the plates being fixed, the other moving at a
constant velocity, the distance between the two plates remaining constant.

4. Find the steady-state motion of an incompressible viscous fluid surrounding a
sphere which is rotating about a diameter with constant angular velocity. No
external forces exist.




CHAPTER 4 


COMPLEX-VARIABLE THEORY 


4.1. Introduction. The reader is already familiar with some aspects
of complex numbers. We enter now into a discussion of some of the
simpler properties of complex numbers. In order to attach a solution
to the equation $x^2 + 1 = 0$, the mathematician is forced to invent a new
number, $i$, such that $i^2 + 1 = 0$, or $i = \sqrt{-1}$. We say that $i$ is an
imaginary number in order to distinguish it from elements of the real-
number field (see Chap. 10 for a discussion of this field). The solution
of the quadratic equation $ax^2 + bx + c = 0$, $a \ne 0$, requires a discussion
of complex numbers of the form $\alpha + \beta i$, $\alpha$ and $\beta$ real. The set of all such
complex numbers subject to certain rules and operations listed below is
called the complex-number field, an extension of the real-number field. We
note that the complex numbers are to satisfy the following set of rules or
postulates with respect to the operations of addition and multiplication:

1. Addition is closed, that is, the sum of two complex numbers is a
complex number.

$$(a + bi) + (c + di) = (a + c) + (b + d)i$$

2. Addition obeys the associative law. If $z_1 = a_1 + b_1 i$, $z_2 = a_2 + b_2 i$,
$z_3 = a_3 + b_3 i$, then

$$z_1 + (z_2 + z_3) = (z_1 + z_2) + z_3$$

3. The unique zero element exists for addition.
$$0 = 0 + 0\cdot i \qquad \text{and} \qquad z + 0 = 0 + z = z \quad \text{for any complex number } z$$

4. Every complex number $z$ has a unique negative, written $-z$, such
that $z + (-z) = (-z) + z = 0$. If $z = a + bi$, then
$$-z = (-a) + (-b)i = -a - bi$$

5. Addition is commutative.
$$z_1 + z_2 = z_2 + z_1$$

6. Multiplication is defined as follows: If $z_1 = a_1 + b_1 i$, $z_2 = a_2 + b_2 i$,
then
$$z_1 z_2 = (a_1 a_2 - b_1 b_2) + (a_1 b_2 + a_2 b_1)i$$
The product of two complex numbers is again a complex number. This
is the closure property for multiplication.

7. Multiplication obeys the associative law.

$$z_1(z_2 z_3) = (z_1 z_2)z_3 = z_1 z_2 z_3$$

8. The unique unit element $1 = 1 + 0\cdot i$ exists for multiplication.
$$1\cdot z = z\cdot 1 = z \quad \text{for all } z$$

9. Every nonzero complex number $z$ has a unique inverse, written
$z^{-1}$ or $1/z$, such that $zz^{-1} = z^{-1}z = 1$. If $z = a + bi$, $a^2 + b^2 \ne 0$, then

$$z^{-1} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\,i$$

10. Multiplication is commutative.
$$z_1 z_2 = z_2 z_1$$

11. The distributive law holds with respect to addition.
$$z_1(z_2 + z_3) = z_1 z_2 + z_1 z_3 \qquad (z_1 + z_2)z_3 = z_1 z_3 + z_2 z_3$$
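
The multiplication rule of postulate 6 and the inverse of postulate 9 can be spelled out directly on pairs $(a, b)$ standing for $a + bi$. The following is a minimal sketch; the pair representation and the function names `c_mul` and `c_inv` are assumptions made for this illustration, not notation from the text.

```python
def c_mul(z1, z2):
    """Product of (a1, b1) and (a2, b2), following postulate 6."""
    a1, b1 = z1
    a2, b2 = z2
    return (a1 * a2 - b1 * b2, a1 * b2 + a2 * b1)

def c_inv(z):
    """Inverse of a nonzero (a, b), following postulate 9."""
    a, b = z
    d = a * a + b * b
    if d == 0:
        raise ZeroDivisionError("the zero element has no inverse")
    return (a / d, -b / d)

product = c_mul((1, 2), (3, 4))            # (1 + 2i)(3 + 4i)
unit = c_mul((3.0, 4.0), c_inv((3.0, 4.0)))  # should be close to (1, 0)
```

The same product can be compared with Python's built-in complex type, `(1 + 2j) * (3 + 4j)`.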


4.2. The Argand Plane. The complex number $z = x + iy$ admits of a
very simple geometric representation. We may consider $z$ as a vector
whose origin is the origin of the Euclidean $xy$ plane of analytic geometry
and whose terminus is the point $P$ with abscissa $x$ and ordinate $y$. The
mapping of complex numbers in this manner yields the Argand $z$ plane.
Addition of complex numbers obeys the parallelogram law of addition of
vectors. We call $x$ the real part of $z$, written $x = \mathrm{Rl}\,z$, and we call $y$
the imaginary component of $z$, written $y = \mathrm{Im}\,z$ (see Fig. 4.1).

[Fig. 4.1: axis label $y$; point $P(x, y)$]

The length of the vector representing $z = x + iy$ is called the modulus
of $z$, written $\operatorname{mod} z = |z| = (x^2 + y^2)^{1/2}$.
The argument of the complex number $z$ is the angle between the real $x$ axis
and the vector $z$, measured in the counterclockwise sense. The argument
of $z$ is not single-valued, for if $\theta = \arg z$, then $\theta + 2\pi n$ is also the
argument of $z$ for any integer $n$. We define the principal value of $\arg z$
by the inequality $-\pi < \operatorname{Arg} z \le \pi$.

Example 4.1. If $z = 1 + i$, then $|z| = \sqrt{2}$, and $\operatorname{Arg} z = \pi/4$. If $z = -1$, then
$|z| = 1$, $\operatorname{Arg} z = \pi$. If $z = -i$, then $|z| = 1$, $\operatorname{Arg} z = -\pi/2$.
Example 4.2. If we use polar coordinates, we may write

$$z = x + iy = r(\cos\theta + i\sin\theta)$$

$|z| = r$, $\arg z = \theta$. If $z_1 = r_1(\cos\theta_1 + i\sin\theta_1)$, $z_2 = r_2(\cos\theta_2 + i\sin\theta_2)$, we leave it
to the reader to show that

$$z_1 z_2 = r_1 r_2[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)]$$
$$\frac{z_1}{z_2} = \frac{r_1}{r_2}[\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)] \qquad r_2 \ne 0$$

We leave it to the reader to attach a simple geometric interpretation to multiplication
and division of complex numbers.
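
The polar rules of Example 4.2 (moduli multiply, arguments add) can be checked against ordinary complex arithmetic. A minimal sketch using Python's standard `cmath` module follows; the helper names `polar_mul` and `polar_div` and the sample values are hypothetical.

```python
import cmath

def polar_mul(p1, p2):
    """Multiply two numbers given as (r, theta): moduli multiply, arguments add."""
    (r1, t1), (r2, t2) = p1, p2
    return (r1 * r2, t1 + t2)

def polar_div(p1, p2):
    """Divide two numbers given as (r, theta); requires r2 != 0."""
    (r1, t1), (r2, t2) = p1, p2
    if r2 == 0:
        raise ZeroDivisionError("division by a number of zero modulus")
    return (r1 / r2, t1 - t2)

z1, z2 = 1 + 1j, -1 + 2j
prod = cmath.rect(*polar_mul(cmath.polar(z1), cmath.polar(z2)))
quot = cmath.rect(*polar_div(cmath.polar(z1), cmath.polar(z2)))
```

The sum of two principal arguments may fall outside $(-\pi, \pi]$, which is why the argument of a product is determined only up to a multiple of $2\pi$; `cmath.rect` accepts any representative.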
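Example 4.3. The reader should verify that

$$\mathrm{Rl}\,(z_1 + z_2) = \mathrm{Rl}\,z_1 + \mathrm{Rl}\,z_2$$
$$\mathrm{Im}\,(z_1 + z_2) = \mathrm{Im}\,z_1 + \mathrm{Im}\,z_2$$
$$|z_1 z_2| = |z_1|\,|z_2|$$
$$\arg z_1 z_2 = \arg z_1 + \arg z_2 + 2\pi n \qquad n \text{ an integer}$$
$$\arg\frac{z_1}{z_2} = \arg z_1 - \arg z_2 + 2\pi n$$
$$-|z| \le \mathrm{Rl}\,z \le |z|$$
$$-|z| \le \mathrm{Im}\,z \le |z|$$

If $z = x + iy$, we define the complex conjugate of $z$ by the equation
$\bar z = x - iy$. The conjugate of a complex number is obtained by replacing
$i$ by $-i$. Obviously

$$\overline{z_1 + z_2} = \bar z_1 + \bar z_2$$
$$z + \bar z = 2\,\mathrm{Rl}\,z$$
$$z - \bar z = 2i\,\mathrm{Im}\,z$$
$$z\bar z = x^2 + y^2 = |z|^2$$
From $|z_1 + z_2|^2 = (z_1 + z_2)(\bar z_1 + \bar z_2)$ we let the reader deduce that
$|z_1 + z_2| \le |z_1| + |z_2|$, and from this that $|z_1 - z_2| \ge \big|\,|z_1| - |z_2|\,\big|$ for all
$z_1$, $z_2$.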

4.3. Simple Mappings. Henceforth the complex number $z$ will stand
always for the complex number $x + iy$. We now examine the complex
number $w = z^2 = (x + iy)^2 = x^2 - y^2 + 2xyi$. It will be highly beneficial
to construct a new Argand plane, called the $w$ plane, with

$$w = u + iv$$

$u$ and $v$ Euclidean coordinates. The equation $w = z^2$ may be looked
upon as a mapping of the complex numbers of the $z$ plane into complex
numbers of the $w$ plane. The transformation $w = z^2$ maps the point $P$
with coordinates $(x, y)$ of the $z$ plane into the point $Q$ of the $w$ plane whose
coordinates are $u = x^2 - y^2$, $v = 2xy$. A curve in the $xy$ plane will, in
general, map into a curve in the $uv$ plane. For example, the transformation
$w = z^2$ maps the straight line $x = t$, $y = mt$, $-\infty < t < \infty$, into
the straight-line segment $u = (1 - m^2)t$, $v = 2mt$, $0 \le t < \infty$ (here the new
parameter $t$ stands for the square of the old one).

The hyperbolas $x^2 - y^2 = \text{constant} = c$ map into the straight lines
$u = c$. The hyperbolas $xy = c$ map into the straight lines $v = 2c$.
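
The statement about hyperbolas is easy to confirm numerically: points taken on $x^2 - y^2 = c$ should all land on the line $u = c$, and points on $xy = c$ on the line $v = 2c$. A minimal sketch follows; `square_map` and the sample values are hypothetical names and numbers chosen for illustration.

```python
import math

def square_map(x, y):
    """Real and imaginary parts (u, v) of w = z^2 for z = x + iy."""
    return (x * x - y * y, 2.0 * x * y)

c = 1.5
ys = (-2.0, -0.5, 0.0, 0.7, 3.0)

# Points on the right-hand branch of x^2 - y^2 = c.
u_values = [square_map(math.sqrt(c + y * y), y)[0] for y in ys]

# Points on the hyperbola x y = c.
xs = (0.3, 1.0, 2.5, -4.0)
v_values = [square_map(x, c / x)[1] for x in xs]
```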


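Example 4.4. We now examine the transformation $w = z + 1/z$, $z \ne 0$. We have
$z = r(\cos\theta + i\sin\theta)$, $w = u + iv$, so that

$$u + iv = r(\cos\theta + i\sin\theta) + \frac{1}{r}(\cos\theta - i\sin\theta)$$
and
$$u = \left(r + \frac{1}{r}\right)\cos\theta \qquad v = \left(r - \frac{1}{r}\right)\sin\theta$$

From $\cos^2\theta + \sin^2\theta = 1$ we have

$$\frac{u^2}{(r + 1/r)^2} + \frac{v^2}{(r - 1/r)^2} = 1$$

The circles $r = a \ne 1$ map into the ellipses

$$\frac{u^2}{(a + 1/a)^2} + \frac{v^2}{(a - 1/a)^2} = 1$$

of the $uv$ plane. Into what curve does the circle $r = 1$ map?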
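
The ellipse equation of Example 4.4 can be tested by mapping sample points of a circle $r = a \ne 1$ and substituting the images into the left-hand side. The sketch below is illustrative only; the function names and the chosen radii and angles are assumptions.

```python
import cmath
import math

def joukowski(z):
    """The transformation w = z + 1/z, z != 0."""
    return z + 1.0 / z

def ellipse_residual(a, theta):
    """u^2/(a + 1/a)^2 + v^2/(a - 1/a)^2 - 1 for the image of the point
    with modulus a (a != 1) and argument theta."""
    w = joukowski(cmath.rect(a, theta))
    return (w.real / (a + 1.0 / a)) ** 2 + (w.imag / (a - 1.0 / a)) ** 2 - 1.0

angles = [2.0 * math.pi * k / 12 for k in range(12)]
residuals = [ellipse_residual(a, t) for a in (0.5, 2.0, 3.0) for t in angles]
```

Note that $a$ and $1/a$ give the same semiaxes, so the circles $r = 0.5$ and $r = 2$ map onto the same ellipse.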


Problems 


1. If $a + bi = c + di$, $a$, $b$, $c$, $d$ real, show that $a = c$, $b = d$.

2. Show that $0\cdot z = 0$ for all $z$ by use of the distributive law and the definition of
the zero element.

3. From $|z_1 + z_2|^2 = (z_1 + z_2)(\bar z_1 + \bar z_2)$ show that $|z_1 + z_2| \le |z_1| + |z_2|$.

4. Show that $(\cos\theta + i\sin\theta)^n = \cos(n\theta) + i\sin(n\theta)$, $n$ an integer. This is a
formula of De Moivre. Obtain an identity involving $\cos 4\theta$. Solve for the roots of
2=1,z24=1, 26 =1, 

5. Examine the transformation w = z — 1/z in the manner of Example 4.4. 

6. Examine the transformations $w = az$, $a$ complex, $w = z + b$, $b$ complex, $w = 1/z$.
Examine the important bilinear transformation $w = (az + b)/(cz + d)$, $a$, $b$, $c$, $d$
complex.


4.4. Definition of a Complex Function. Let Z = {z} be a set of com- 
plex numbers defined in some manner. Now assume that by some rule 
or set of rules we can set up a correspondence such that for every point z 
of Z there corresponds a unique complex number w, and let W = {w} 
be the totality of complex numbers obtained in this manner by exhausting 
the set $Z$. We thus have a mapping $Z \to W$ which defines a complex
function of z over the set Z. The correspondence between the element z 
and the element w is usually written 


w = f(z) (4.1) 


It is customary to consider $f(z)$ as the complex function of $z$. We can
write $f(z) = u(x, y) + iv(x, y)$, since, given $z$, we are given $x$ and $y$,
which in turn yield the real and imaginary parts of $w$, called $u(x, y)$ and
$v(x, y)$, respectively.


Example 4.5. Let $Z$ be the set of complex numbers $z$ such that $|z| > 1$. The
correspondence $z \to w$ such that $w = z/(|z| - 1)$ defines a complex function of $z$ for the
domain of definition of $z$. In this case

$$u(x, y) = \frac{x}{\sqrt{x^2 + y^2} - 1} \qquad v(x, y) = \frac{y}{\sqrt{x^2 + y^2} - 1}$$

with $x^2 + y^2 > 1$.

Example 4.6. Let $Z$ be the finite set of complex numbers $z = 1$ and $z = 2$. Let
$z = 1$ correspond to $w = 5$, $z = 2$ correspond to $w = 1 + i$. This mapping defines
a complex function of $z$ defined over the set $Z$.

Example 4.7. Let Z be the set of real integers, and let z of Z correspond to the 
constant w = 7 for all z of Z. Note that in this case more than one element of Z 
corresponds to the same value w. A complex function can be a many-to-one mapping. 
Remember, however, that only one w corresponds to each z. This is what we mean 
by a single-valued function of z. 


4.5. Continuous Functions. We define continuity of a single-valued
complex function in the following manner: Let the points $z$ of $Z$ be mapped
into the points $w$ of $W$, and consider the two Argand planes, the $z$ plane
and the $w$ plane. We say that $f(z)$ is continuous at $z = z_0$, $z_0$ in $Z$, if the
following holds: Consider any circle $C$ of nonzero radius with center at
$w_0 = f(z_0)$. We must be able to determine a circle $C'$ of nonzero radius
with center at $z_0$ such that every point $z$ of $Z$ interior to $C'$ maps into a
point $w$ of $W$ interior to $C$. Analytically this means that, given any
$\epsilon > 0$ (the radius of the circle $C$), there exists a $\delta > 0$ (the radius of the
circle $C'$) such that

$$|f(z) - f(z_0)| < \epsilon$$

for all $z$ of $Z$ such that $|z - z_0| < \delta$. This is the usual $\epsilon$, $\delta$ definition of
continuity of real-variable theory. If $f(z)$ is continuous at every point
$z$ of $Z$, we say that $f(z)$ is continuous over $Z$.

If $z_n$, $n = 1, 2, 3, \ldots$, is an infinite sequence of $Z$ tending to $z_0$ as a
limit, $z_0$ in $Z$, then the above definition of continuity implies that

$$\lim_{n \to \infty} f(z_n) = f(z_0)$$

Let the reader verify this.


Example 4.8. Let $f(z) = z$ for all $z$. We easily note that this function is continuous
for all $z$ since we need only pick $\delta = \epsilon$ for every choice of $\epsilon > 0$.

Example 4.9. Let $f(z)$ and $g(z)$ be defined over the same set $Z = \{z\}$, and assume
that $f(z)$ and $g(z)$ are continuous at the point $z_0$ in $Z$. We now show that $f(z) + g(z)$
is continuous at $z_0$. Choose any $\epsilon > 0$, and consider $\epsilon/2 > 0$. Since $f(z)$ is continuous
at $z_0$ there exists a $\delta_1 > 0$ such that $|f(z) - f(z_0)| < \epsilon/2$ for $|z - z_0| < \delta_1$, $z$ in
$Z$. A similar statement holds for $g(z)$, with $\delta_2$ replacing $\delta_1$. The reader should be
able to show that $|(f(z) + g(z)) - (f(z_0) + g(z_0))| < \epsilon$ for $|z - z_0| < \delta$, $z$ in $Z$, $\delta$ the
smaller of $\delta_1$, $\delta_2$.


Example 4.10. Let the reader prove the following statements: Let $f(z)$ and $g(z)$ be
defined over $Z$, and assume $f(z)$ and $g(z)$ continuous at $z_0$ in $Z$. Then

1. $f(z) - g(z)$ is continuous at $z_0$.

2. $f(z)g(z)$ is continuous at $z_0$.

3. $f(z)/g(z)$ is continuous at $z_0$ provided $g(z_0) \ne 0$.


Problems

1. Prove the statements of Example 4.10.

2. Show that $f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$, $n$ a positive integer, is continuous
for all $z$, $z \ne \infty$.

3. Show that the $f(z)$ of Examples 4.6 and 4.7 are continuous over their domain of
definition.

4. Let $f(z) = (z^3 - 1)/(z - 1)$, $z \ne 1$, $f(1) = 3$. Show that $f(z)$ is continuous at
$z = 1$.

5. Let $f(z)$ be continuous over a closed and bounded set. Show that $f(z)$ is uniformly
continuous (see Sec. 10.12 for the definition of uniform continuity).

6. Let $f(z) = 1/(1 - z)$, $z \ne 1$, $f(1) = 5$. Show that $f(z)$ is discontinuous at $z = 1$.

7. Show that f(z) = z sin (1/xz) + wz, x € 0, f(0) = 0, is continuous at the origin 
(0, 0). 


4.6. Differentiability. Let $f(z)$ be defined for the set $Z = \{z\}$. Let
$z_0$ be a point of $Z$, and let $z_1, z_2, z_3, \ldots, z_n, \ldots$ be any infinite sequence
of elements of $Z$ which tend to $z_0$ as a limit. None of the $z_n$, $n = 1, 2, 3, \ldots$,
is to be equal to $z_0$. We say that $f(z)$ is differentiable at $z_0$ if

$$\lim_{z_n \to z_0}\frac{f(z_n) - f(z_0)}{z_n - z_0} \qquad (4.2)$$

exists independent of the sequential approach to $z_0$. We can state
differentiability in an equivalent manner. $f(z)$ is differentiable at $z_0$
if there exists a constant, written $f'(z_0)$, so that for any $\epsilon > 0$ there exists
a $\delta > 0$ such that

$$\left|\frac{f(z) - f(z_0)}{z - z_0} - f'(z_0)\right| < \epsilon$$

whenever $|z - z_0| < \delta$ for $z$ in $Z$, $z \ne z_0$.

In the cases we shall be interested in, $z_0$ will be an interior point of $Z$
(see Sec. 10.7 for the definition of an interior point). In this case, (4.2)
becomes

$$f'(z_0) = \lim_{\Delta z \to 0}\frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z} \qquad (4.3)$$
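independent of the approach to zero of $\Delta z$. Let us consider

$$f(z) = u(x, y) + iv(x, y)$$

and investigate the conditions that will be imposed on $u(x, y)$ and $v(x, y)$
in order that

$$\lim_{\Delta z \to 0}\frac{f(z + \Delta z) - f(z)}{\Delta z}$$

shall exist independent of the approach to zero of $\Delta z$. We have

$$\Delta z = \Delta x + i\,\Delta y$$
$$f(z + \Delta z) = u(x + \Delta x, y + \Delta y) + iv(x + \Delta x, y + \Delta y)$$
so that
$$\frac{f(z + \Delta z) - f(z)}{\Delta z} = \frac{u(x + \Delta x, y + \Delta y) - u(x, y)}{\Delta x + i\,\Delta y} + i\,\frac{v(x + \Delta x, y + \Delta y) - v(x, y)}{\Delta x + i\,\Delta y} \qquad (4.4)$$

First we let $\Delta z \to 0$ by keeping $y$ constant, that is, $\Delta z = \Delta x$, $\Delta y = 0$.
Then (4.4) becomes

$$\frac{f(z + \Delta z) - f(z)}{\Delta z} = \frac{u(x + \Delta x, y) - u(x, y)}{\Delta x} + i\,\frac{v(x + \Delta x, y) - v(x, y)}{\Delta x}$$
and
$$\lim_{\Delta z \to 0}\frac{f(z + \Delta z) - f(z)}{\Delta z} = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x}$$

provided $\dfrac{\partial u}{\partial x}$ and $\dfrac{\partial v}{\partial x}$ exist. Now we let $\Delta z \to 0$ by keeping $x$ constant,
that is, $\Delta x = 0$, $\Delta z = i\,\Delta y$. Then

$$\lim_{\Delta z \to 0}\frac{f(z + \Delta z) - f(z)}{\Delta z} = \frac{1}{i}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} = \frac{\partial v}{\partial y} - i\,\frac{\partial u}{\partial y}$$

provided $\dfrac{\partial u}{\partial y}$ and $\dfrac{\partial v}{\partial y}$ exist. Assuming differentiability of $f(z)$ yields
$$\frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} - i\,\frac{\partial u}{\partial y}$$
so that
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \qquad (4.5)$$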


The Cauchy-Riemann conditions (4.5) are necessary if f(z) is to be 
differentiable. We have, however, neglected the infinity of other meth- 
ods by which Az may approach zero. But it will turn out that the 
Cauchy-Riemann equations (4.5) will be sufficient for differentiability 
provided we further assume that the partial derivatives of (4.5) are 
continuous. Let us proceed to prove this statement. The right-hand 


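side of (4.4) may be written

$$\frac{u(x + \Delta x, y + \Delta y) - u(x, y + \Delta y) + u(x, y + \Delta y) - u(x, y)}{\Delta x + i\,\Delta y}$$
$$+\; i\,\frac{v(x + \Delta x, y + \Delta y) - v(x, y + \Delta y) + v(x, y + \Delta y) - v(x, y)}{\Delta x + i\,\Delta y}$$
$$= \frac{\Delta x\,(\partial u/\partial x + \xi_1) + \Delta y\,(\partial u/\partial y + \xi_2)}{\Delta x + i\,\Delta y} + i\,\frac{\Delta x\,(\partial v/\partial x + \xi_3) + \Delta y\,(\partial v/\partial y + \xi_4)}{\Delta x + i\,\Delta y}$$

where $\xi_1$, $\xi_2$, $\xi_3$, $\xi_4$ tend to zero as $\Delta z \to 0$. We have applied the theorem
of the mean of the differential calculus. The $\xi_i \to 0$ if we assume continuity
of the partial derivatives. Making use of the Cauchy-Riemann
equations yields

$$\frac{f(z + \Delta z) - f(z)}{\Delta z} = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} + \frac{\xi_1\,\Delta x + \xi_2\,\Delta y + i(\xi_3\,\Delta x + \xi_4\,\Delta y)}{\Delta x + i\,\Delta y}$$

We leave it to the reader to show that

$$\lim_{\Delta x + i\,\Delta y \to 0}\frac{\xi_1\,\Delta x + \xi_2\,\Delta y + i(\xi_3\,\Delta x + \xi_4\,\Delta y)}{\Delta x + i\,\Delta y} = 0$$
Hence $\lim_{\Delta z \to 0}\dfrac{f(z + \Delta z) - f(z)}{\Delta z}$ exists and equals $\dfrac{\partial u}{\partial x} + i\,\dfrac{\partial v}{\partial x}$ independent of
the manner of approach to zero of $\Delta z$. We have proved Theorem 4.1.
THEOREM 4.1. $f(z) = u + iv$ is differentiable if $u$ and $v$ satisfy the
Cauchy-Riemann equations and if, furthermore, the four first partial
derivatives of $u$ and $v$ with respect to $x$ and $y$ are continuous.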
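
The Cauchy-Riemann conditions lend themselves to a quick numerical check. The sketch below estimates the four first partial derivatives by central differences, reports how far $\partial u/\partial x = \partial v/\partial y$ and $\partial u/\partial y = -\partial v/\partial x$ are from holding, and forms $\partial u/\partial x + i\,\partial v/\partial x$. The sample functions (the real and imaginary parts of $z^2$ and of $x - iy$), the step `h`, and all names are assumptions made for this illustration.

```python
def partials(g, x, y, h=1e-6):
    """Central-difference estimates of (dg/dx, dg/dy) at (x, y)."""
    gx = (g(x + h, y) - g(x - h, y)) / (2.0 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2.0 * h)
    return gx, gy

def cauchy_riemann_residual(u, v, x, y):
    """Return (|u_x - v_y|, |u_y + v_x|); both vanish when (4.5) holds."""
    ux, uy = partials(u, x, y)
    vx, vy = partials(v, x, y)
    return abs(ux - vy), abs(uy + vx)

def derivative_from_partials(u, v, x, y):
    """The value u_x + i v_x, which equals f'(z) when f is differentiable."""
    ux, _ = partials(u, x, y)
    vx, _ = partials(v, x, y)
    return complex(ux, vx)

# f(z) = z^2: u = x^2 - y^2, v = 2xy.
def u_sq(x, y):
    return x * x - y * y

def v_sq(x, y):
    return 2.0 * x * y

# f(z) = x - iy: u = x, v = -y.
def u_conj(x, y):
    return x

def v_conj(x, y):
    return -y

res_sq = cauchy_riemann_residual(u_sq, v_sq, 1.5, -0.7)
res_conj = cauchy_riemann_residual(u_conj, v_conj, 1.5, -0.7)
fprime = derivative_from_partials(u_sq, v_sq, 1.5, -0.7)
```

A small residual at one point is only evidence, not a proof; the theorem also requires continuity of the partial derivatives.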
The following example shows that we cannot, in general, discard the
continuity of the partial derivatives. Let

$$f(z) = \frac{x^3(1 + i) - y^3(1 - i)}{x^2 + y^2} \quad z \ne 0 \qquad f(0) = 0$$
Here
$$u(x, y) = \frac{x^3 - y^3}{x^2 + y^2} \qquad u(0, 0) = 0$$
$$v(x, y) = \frac{x^3 + y^3}{x^2 + y^2} \qquad v(0, 0) = 0$$
Moreover
$$\frac{\partial u}{\partial x}\bigg|_{(0,0)} = \lim_{x \to 0}\frac{u(x, 0) - u(0, 0)}{x - 0} = \lim_{x \to 0}\frac{x}{x} = 1$$
Similarly $\dfrac{\partial u}{\partial y} = -1$, $\dfrac{\partial v}{\partial x} = 1$, $\dfrac{\partial v}{\partial y} = 1$, at $z = 0$, so that the Cauchy-
Riemann equations hold at $z = 0$. However, $f'(0)$ depends on the
approach to the origin, for let $y = mx$, so that
$$f(z) = \frac{x^3(1 + i) - m^3 x^3(1 - i)}{x^2(1 + m^2)}$$
along $y = mx$, and

$$\frac{f(z) - f(0)}{z - 0} = \frac{x[(1 + i) - m^3(1 - i)]}{x(1 + m^2)(1 + mi)} = \frac{(1 + i) - m^3(1 - i)}{(1 + m^2)(1 + mi)}$$

which depends on $m$. Let the reader show that $\dfrac{\partial u}{\partial x}$ is discontinuous at
the origin by showing that $\lim\limits_{x \to 0,\ y \to 0}\dfrac{\partial u}{\partial x}$ does not exist.
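
The dependence of the difference quotient on the direction of approach can be seen numerically. The sketch below takes $f(z) = [x^3(1 + i) - y^3(1 - i)]/(x^2 + y^2)$, $f(0) = 0$, evaluates $(f(z) - f(0))/z$ at a point close to the origin on the line $y = mx$, and compares it with the closed form $[(1 + i) - m^3(1 - i)]/[(1 + m^2)(1 + mi)]$. The function names and the sample slopes are assumptions made for this illustration.

```python
def f(z):
    """f(z) = (x^3 (1 + i) - y^3 (1 - i)) / (x^2 + y^2), with f(0) = 0."""
    if z == 0:
        return 0j
    x, y = z.real, z.imag
    return (x ** 3 * (1 + 1j) - y ** 3 * (1 - 1j)) / (x * x + y * y)

def quotient_along_line(m, x=1e-6):
    """Difference quotient (f(z) - f(0)) / z at z = x + i m x."""
    z = complex(x, m * x)
    return (f(z) - f(0)) / z

def predicted_quotient(m):
    """Closed form of the quotient along y = m x."""
    return ((1 + 1j) - m ** 3 * (1 - 1j)) / ((1 + m * m) * (1 + m * 1j))

q_axis = quotient_along_line(0.0)      # approach along the real axis
q_diag = quotient_along_line(1.0)      # approach along y = x
```

Along the real axis the quotient is $1 + i$, while along $y = x$ it is $(1 + i)/2$, so no single value can serve as $f'(0)$.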


DEFINITION 4.1. Let $f(z)$ be differentiable at $z = z_0$. If, furthermore,
there exists a circle with center at $z_0$ such that $f(z)$ is differentiable at
every interior point of that circle, we say that $f(z)$ is regular, or analytic,
at $z_0$. Analyticity at a point is stronger than mere differentiability at
that point (see Prob. 10 of this section). If $f(z)$ is not analytic at $z_0$, we
say that $z_0$ is a singular point of $f(z)$.


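Example 4.11. We show that $f(z) = z^2$ is differentiable everywhere. Since
$f(z) = (x + iy)^2 = x^2 - y^2 + 2xyi$, we have $u = x^2 - y^2$, $v = 2xy$. Hence

$$\frac{\partial u}{\partial x} = 2x = \frac{\partial v}{\partial y} \qquad \frac{\partial u}{\partial y} = -2y = -\frac{\partial v}{\partial x}$$
so that the Cauchy-Riemann equations hold. It is obvious that the partial
derivatives are continuous. Thus $f'(z) = \dfrac{\partial u}{\partial x} + i\,\dfrac{\partial v}{\partial x} = 2x + 2yi = 2(x + iy) = 2z$.
We could have shown that $f'(z)$ exists and is $2z$ by applying (4.3).

$$f'(z) = \lim_{\Delta z \to 0}\frac{(z + \Delta z)^2 - z^2}{\Delta z} = \lim_{\Delta z \to 0}\frac{2z\,\Delta z + (\Delta z)^2}{\Delta z} = 2z$$

Let the reader show that if $f(z) = z^n$, $n$ an integer, then $f'(z) = nz^{n-1}$.
Example 4.12. The reader should verify that if $f'(z_0)$, $g'(z_0)$ exist, then

$$(1)\quad \frac{d}{dz}[f(z) + g(z)]\bigg|_{z = z_0} = f'(z_0) + g'(z_0)$$
$$(2)\quad \frac{d}{dz}[f(z)g(z)]\bigg|_{z = z_0} = f(z_0)g'(z_0) + f'(z_0)g(z_0)$$
$$(3)\quad \frac{d}{dz}\left[\frac{f(z)}{g(z)}\right]\bigg|_{z = z_0} = \frac{g(z_0)f'(z_0) - f(z_0)g'(z_0)}{[g(z_0)]^2} \qquad g(z_0) \ne 0$$
We assume that $f(z)$ and $g(z)$ have the same domain of definition.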
Problems 


1. If $u$ and $v$ satisfy (4.5) and if their second partials exist, show that $\nabla^2 u = \nabla^2 v = 0$.
Also show that
$$\frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = 0$$
Give a geometric interpretation of this last equation. Remember that
$$\nabla u = \frac{\partial u}{\partial x}\,\mathbf{i} + \frac{\partial u}{\partial y}\,\mathbf{j}$$
is a vector normal to the curve $u = \text{constant}$.
The fact that $u$ and $v$ must satisfy Laplace’s equation proves to be useful in the
application of complex-variable theory to electricity and hydrodynamics.
2. Let $f(z) = u + iv$ be differentiable, and assume $u$ is given such that $\nabla^2 u = 0$.
Show that $v(x, y) = -\displaystyle\int_0^x \frac{\partial u(x, 0)}{\partial y}\,dx + \int_0^y \frac{\partial u(x, y)}{\partial x}\,dy$. Let

$$f(z) = x^3 - 3xy^2 + iv(x, y)$$

be differentiable. Find $v(x, y)$. Note that $\nabla^2(x^3 - 3xy^2) = 0$.
3. Show that $f(z) = x - iy$ is not differentiable.
4. Show that $f(z) = x^3 - 3xy^2 + i(3x^2 y - y^3)$ is analytic everywhere. Then show
that $f(z) = z^3$. If $f(x + iy) = u(x, y) + iv(x, y)$, show that $f(z) = u(z, 0) + iv(z, 0)$.
5. Let $\sin z = \sin x\cosh y + i\cos x\sinh y$. Show that $\sin z$ is analytic everywhere.
Is this definition of $\sin z$ consistent for $z$ real? Define $\cos z = \dfrac{d\sin z}{dz}$, and
show that $\sin^2 z + \cos^2 z = 1$.
6. If $f'(z) = 0$ for all $z$, show that $f(z) = \text{constant}$.
7. Prove the statements in Example 4.12.
8. If $f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_0 = \sum_{k=0}^{n} a_k z^k$, show that $f'(z) = \sum_{k=1}^{n} k a_k z^{k-1}$.
9. If $f(z) = \sum_{n=0}^{\infty} a_n z^n$ converges for $0 \le |z| < R$, show that $f'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1}$ for
$0 \le |z| < R$.


10. Show that f(z) = x?y? 1s differentiable only at the origin. 
11. Assume $f'(a)$ and $g'(a)$ exist, $g'(a) \ne 0$. If $f(a) = g(a) = 0$, show that

$$\lim_{z \to a}\frac{f(z)}{g(z)} = \frac{f'(a)}{g'(a)}$$


4.7. The Definite Integral. The reader is urged to read first those
sections of Chap. 10 concerning the uniform continuity of a continuous
function, rectifiable Jordan curves, the Riemann integral, and Cauchy
sequences. Let $\Gamma$ be a rectifiable Jordan curve¹ joining the points
$z = \alpha$, $z = \beta$. Let $f(z)$ be a continuous complex function defined on $\Gamma$.
We are not concerned with the definition of $f(z)$ elsewhere. Neither do
we introduce the differentiability of $f(z)$. The Riemann integral of $f(z)$
over $\Gamma$ is defined as follows: Subdivide $\Gamma$ into $n$ parts in any manner
whatsoever. Call the points of the subdivision $\alpha = z_0, z_1, z_2, \ldots, z_k,
z_{k+1}, \ldots, z_n = \beta$ (see Fig. 4.2). On each of the paths $z_0 z_1, z_1 z_2, \ldots,
z_k z_{k+1}, \ldots, z_{n-1}z_n$ choose a point $\xi_1, \xi_2, \ldots, \xi_k, \xi_{k+1}, \ldots, \xi_n$, and
form the partial sum

$$J_n = \sum_{k=1}^{n} f(\xi_k)(z_k - z_{k-1}) \qquad (4.6)$$

[Fig. 4.2]

¹ Called a simple curve.

If $\lim_{n \to \infty} J_n = J$ exists and is unique independent of the choice of the
$\xi_k$ and independent of the method of subdividing $\Gamma$ provided only that,
as $n$ tends to infinity, the maximum of the arc lengths from $z_{k-1}$ to $z_k$,
$k = 1, 2, \ldots, n$, tends to zero, we say that $f(z)$ is Riemann-integrable
over $\Gamma$ and write

$$J = \int f(z)\,dz \text{ over } \Gamma = \int_\alpha^\beta f(z)\,dz = \int_\Gamma f(z)\,dz \qquad (4.7)$$

This definition agrees with the definition of the Riemann integral of real-
variable theory if $\Gamma$ is a section of the real axis, $\Gamma$: $\alpha \le x \le \beta$, and if
$f(z)$ is a real function of $x$.
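
The partial sums used to define the integral can be computed directly. In the sketch below the curve is given by a parametrization `path(t)`, $0 \le t \le 1$, the subdivision is uniform in the parameter, and the point chosen on each arc is selected by `tag`; these choices, the function names, and the test curves are assumptions made for this illustration only.

```python
import cmath
import math

def riemann_sum(f, path, n, tag=0.5):
    """Partial sum of f(xi_k) (z_k - z_{k-1}) over n arcs of the curve path(t).

    tag in [0, 1] fixes where on each parameter interval xi_k is taken.
    """
    total = 0j
    z_prev = path(0.0)
    for k in range(1, n + 1):
        z_k = path(k / n)
        xi_k = path((k - 1 + tag) / n)   # a point on the arc from z_{k-1} to z_k
        total += f(xi_k) * (z_k - z_prev)
        z_prev = z_k
    return total

def identity(z):
    return z

def segment(t):
    return (3 + 4j) * t                         # straight line from 0 to 3 + 4i

def quarter_circle(t):
    return cmath.exp(1j * math.pi * t / 2.0)    # unit-circle arc from 1 to i

J_mid = riemann_sum(identity, segment, 1000)             # midpoint tags
J_left = riemann_sum(identity, segment, 1000, tag=0.0)   # left end-point tags
J_arc = riemann_sum(identity, quarter_circle, 2000)
```

For $f(z) = z$ the sums should approach $(\beta^2 - \alpha^2)/2$ whatever tags are used, which gives $-3.5 + 12i$ on the segment and $-1$ on the quarter circle.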

We now show that, if $f(z)$ is continuous on the rectifiable Jordan arc
given by $x = \varphi(t)$, $y = \psi(t)$, $t_0 \le t \le t_1$, then the Riemann integral of $f(z)$
over $\Gamma$ exists. The proof proceeds as follows: Let us first look at any
arc of $\Gamma$ joining $z = a$ to $z = b$ and consider

$$S = f(\xi)(b - a)$$

where $\xi$ is any point on the arc joining $z = a$ and $z = b$. A further subdivision
of this arc into $z_0 = a, z_1, z_2, \ldots, z_n = b$ yields the partial sum
defined as in (4.6),

$$S_n = \sum_{k=1}^{n} f(\xi_k)(z_k - z_{k-1})$$
Now
$$S = f(\xi)(b - a) = f(\xi)\sum_{k=1}^{n}(z_k - z_{k-1}) = \sum_{k=1}^{n} f(\xi)(z_k - z_{k-1})$$
so that
$$S - S_n = \sum_{k=1}^{n}[f(\xi) - f(\xi_k)](z_k - z_{k-1})$$

If furthermore the maximum variation of $f(z)$ on the arc joining $z = a$,
$z = b$ is $\sigma$, then

$$|S - S_n| \le \sum_{k=1}^{n}|f(\xi) - f(\xi_k)|\,|z_k - z_{k-1}| \le \sigma\sum_{k=1}^{n}|z_k - z_{k-1}| \le \sigma L \qquad (4.8)$$


where 1 is the length of arc from z = atoz = 6b. Why is 
y |@. — 2%] S$ L? 
k=1 


This result will be important in what follows. Now we use the property
that $f(z)$ is uniformly continuous on $\Gamma$. Choose a subdivision of $\Gamma$ so
that the maximum variation of $f(z)$ on any segmental arc of the subdivision
is less than $1/2$, and obtain $J_1$ [see (4.6)]. Now impose a finer subdivision
on the previous subdivision such that the maximum variation
of $f(z)$ on any segmental arc of the new subdivision is less than $1/2^2$, and
form a $J_2$ for this subdivision. Continue this process. For $J_n$ the
subdivisions are so fine that the maximum variation of $f(z)$ on any segmental
arc is less than $1/2^n$. Moreover the maximum subdivision tends
to zero in size. We obtain the sequence of complex numbers

$$J_1, J_2, \ldots, J_n, \ldots \tag{4.9}$$

We now show that $\lim_{n \to \infty} J_n$ exists. Choose any $\epsilon > 0$. We can find an
integer $n_0$ such that $1/2^n < \epsilon/L$ for $n \ge n_0$, where $L$ is the length of arc
of $\Gamma$. Now for $m \ge n \ge n_0$ we have

$$|J_m - J_n| \le \frac{L}{2^n} < \epsilon$$

using the result of (4.8). Hence the sequence (4.9) is a Cauchy sequence
and

$$\lim_{n \to \infty} J_n = J$$
We must now show that $J$ is the same limit for any other choice of the
$\xi_k$. For the new choice of the $\xi_k$ we obtain the sequence

$$J_1', J_2', \ldots, J_n', \ldots$$

and by exactly the same reasoning as above we have

$$\lim_{n \to \infty} J_n' = J'$$

But $|J_n - J_n'| < (1/2^n)L$, and we leave it to the reader to show that this
implies $J = J'$.

The final step is to show that the same limit $J$ occurs for any other
method of subdividing provided the maximum length of the subdivisions
tends to zero. For any other sequence $K_1, K_2, \ldots, K_n, \ldots$ of partial
sums of the form (4.6) we can superimpose the subdivisions which yield
$K_n$ on the subdivisions of $J_n$ of (4.9). Then


L 
In — Kal <a 


from (4.8). We leave it to the reader to show that $\lim_{n \to \infty} K_n = \lim_{n \to \infty} J_n$.


This concludes the proof. The reader is referred to that excellent text
by Knopp, "Theory of Functions," Dover Publications, 1945, for a much
clearer exposition of the Riemann integral.

For the reader who has trouble in understanding what has been
attempted let us note that if $f(z) = u(x, y) + iv(x, y)$ and if $\Gamma$ is given by
$x = x(t)$, $y = y(t)$, $t_0 \le t \le t_1$, then $dz = dx + i\,dy$,

$$f(z)\,dz = u(x, y)\,dx - v(x, y)\,dy + i[u(x, y)\,dy + v(x, y)\,dx]$$

so that it would be logical to define

$$\int_\Gamma f(z)\,dz = \int_\Gamma u(x, y)\,dx - v(x, y)\,dy + i \int_\Gamma u(x, y)\,dy + v(x, y)\,dx \tag{4.10}$$

where $\int_\Gamma u(x, y)\,dx$, etc., are the ordinary line integrals of real-variable
theory. It is not very difficult to show that this definition and existence
of the line integral agrees with that discussed above for a continuous
$f(z)$ and hence a continuous $u(x, y)$, $v(x, y)$. In particular if the Jordan
curve is regular in the sense that $dx/dt$ and $dy/dt$ are continuous, then (4.10)
becomes

$$\int_\Gamma f(z)\,dz = \int_{t_0}^{t_1} \left[ u(x(t), y(t)) \frac{dx}{dt} - v(x(t), y(t)) \frac{dy}{dt} \right] dt
 + i \int_{t_0}^{t_1} \left[ u(x(t), y(t)) \frac{dy}{dt} + v(x(t), y(t)) \frac{dx}{dt} \right] dt \tag{4.11}$$





Example 4.13. We evaluate $\int_\Gamma f(z)\,dz$, where $f(z) = z$ and the curve $\Gamma$ is given as
the straight line joining the origin and the point (3, 4). $\Gamma$ may be written $x = 3t$,
$y = 4t$, $0 \le t \le 1$. Along $\Gamma$

$$f(z) = z = x + iy = (3 + 4i)t$$

Moreover the vector $z$ to the curve is $z = (3 + 4i)t$ so that $dz = (3 + 4i)\,dt$ and
$f(z)\,dz = (3 + 4i)^2 t\,dt$.

$$\int_\Gamma f(z)\,dz = \int_0^1 (3 + 4i)^2 t\,dt = \tfrac{1}{2}(3 + 4i)^2$$
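The partial sums of (4.6) can be evaluated numerically as a check on Example 4.13. The following is only a sketch: the function name `line_integral`, the choice of the midpoint of each parameter subinterval for $\xi_k$, and the number of subdivisions are assumptions made for illustration, not part of the text.

```python
def line_integral(f, z_of_t, t0, t1, n=20000):
    """Approximate the integral of f(z) dz along z = z_of_t(t), t0 <= t <= t1.

    Forms the partial sum  sum f(xi_k) (z_k - z_{k-1}),  taking xi_k at the
    midpoint of each parameter subinterval (an arbitrary but admissible choice).
    """
    h = (t1 - t0) / n
    total = 0j
    z_prev = z_of_t(t0)
    for k in range(1, n + 1):
        z_next = z_of_t(t0 + k * h)
        xi = z_of_t(t0 + (k - 0.5) * h)
        total += f(xi) * (z_next - z_prev)
        z_prev = z_next
    return total


def segment(t):
    # Example 4.13: x = 3t, y = 4t, 0 <= t <= 1
    return complex(3 * t, 4 * t)


approx = line_integral(lambda z: z, segment, 0.0, 1.0)
exact = 0.5 * (3 + 4j) ** 2  # (1/2)(3 + 4i)^2
print(approx, exact)
```

Any other choice of the points $\xi_k$ on each sub-arc should give the same limit, which is the content of the existence proof above.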




We could save ourselves all this work if we knew that

$$\int_\alpha^\beta z\,dz = \tfrac{1}{2}(\beta^2 - \alpha^2)$$

for any simple path $\Gamma$ from $z = \alpha$ to $z = \beta$. Let the reader show that this is true.
Hint: subdivide $\Gamma$ into $\alpha = z_0, z_1, z_2, \ldots, z_n = \beta$, and consider

$$J_n = \sum_{k=1}^{n} z_{k-1}(z_k - z_{k-1})$$

$$J_n' = \sum_{k=1}^{n} z_k(z_k - z_{k-1})$$

Show that $J_n + J_n' = \beta^2 - \alpha^2$, and let $n \to \infty$.
In this example we notice that $\int_\Gamma z\,dz$ depends only on the end points of $\Gamma$ and not
on $\Gamma$ itself. The significance of this will become apparent in the next section.
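The identity in the hint holds for every subdivision, not only in the limit, because the terms telescope. A small numerical sketch follows; the function name `hint_sums` and the particular subdivision points are hypothetical choices for illustration.

```python
def hint_sums(points):
    """Return (J_n, J_n') for the subdivision z_0, z_1, ..., z_n in `points`."""
    pairs = list(zip(points, points[1:]))
    j = sum(z_prev * (z_next - z_prev) for z_prev, z_next in pairs)
    j_prime = sum(z_next * (z_next - z_prev) for z_prev, z_next in pairs)
    return j, j_prime


alpha, beta = 1 + 1j, -2 + 3j
# An arbitrary (hypothetical) subdivision of some path from alpha to beta.
points = [alpha, 0.5 + 2j, -1 + 1j, -1.5 + 2.5j, beta]
j, j_prime = hint_sums(points)
print(j + j_prime, beta ** 2 - alpha ** 2)
```

Since $J_n$ and $J_n'$ are both partial sums of the form (4.6), each tends to the integral, so that twice the integral equals $\beta^2 - \alpha^2$.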
Example 4.14. We evaluate $\int_\Gamma f(z)\,dz$ for $f(z) = x - iy$ with $\Gamma$ the straight line
from (0, 0) to (1, 1). Here $x = t$, $y = t$, $0 \le t \le 1$, $z = t + it$, $dz = (1 + i)\,dt$ so that

$$\int_\Gamma f(z)\,dz = \int_0^1 (1 - i)(1 + i)t\,dt = 1$$

For the same $f(z)$ let us take $\Gamma$ as the sum of the two straight-line segments, one from
(0, 0) to (1, 0), the second from (1, 0) to (1, 1). For the first path $x = t$, $y = 0$,
$0 \le t \le 1$, so that $dz = dt$ and $f(z) = t$ and $\int_{\Gamma_1} f(z)\,dz = \tfrac{1}{2}$. Along the second path $x = 1$,
$dx = 0$, $y = t$, $dz = i\,dt$, $0 \le t \le 1$, and $f(z) = 1 - it$ so that

$$\int_{\Gamma_2} f(z)\,dz = \int_0^1 (1 - it)\,i\,dt = \tfrac{1}{2} + i$$

Hence

$$\int_\Gamma f(z)\,dz = \int_{\Gamma_1} f(z)\,dz + \int_{\Gamma_2} f(z)\,dz = 1 + i$$

Thus, for $f(z) = x - iy$, the line integral is not independent of the path. One may
guess that the nonanalyticity of $f(z) = x - iy$ may be the answer. The next section
will verify this fact.
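The two values found in Example 4.14 can be reproduced numerically. This is a minimal sketch, assuming a midpoint-rule integrator over polygonal paths; the helper name `integrate_polyline` is hypothetical.

```python
def integrate_polyline(f, vertices, n=2000):
    """Midpoint-rule approximation of the integral of f(z) dz along the
    polygonal path through the given vertices."""
    total = 0j
    for a, b in zip(vertices, vertices[1:]):
        a, b = complex(a), complex(b)
        step = (b - a) / n
        for k in range(n):
            total += f(a + (k + 0.5) * step) * step
    return total


def conj(z):
    # f(z) = x - iy
    return z.conjugate()


direct = integrate_polyline(conj, [0, 1 + 1j])      # straight line to (1, 1)
corner = integrate_polyline(conj, [0, 1, 1 + 1j])   # via the point (1, 0)
print(direct, corner)
```

The first value should be close to $1$ and the second close to $1 + i$, in agreement with the example.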

Example 4.15. The function $f(z) = 1/z$ is continuous and analytic everywhere
except at the origin. Let us compute $\oint_\Gamma f(z)\,dz$, where $\Gamma$ is a circle of radius $a$ with
center at the origin. It is easy to see that $z = a(\cos\theta + i \sin\theta)$, $0 \le \theta \le 2\pi$,
represents the circle. Hence $dz = a(-\sin\theta + i \cos\theta)\,d\theta$, and

$$\oint_\Gamma \frac{dz}{z} = \int_0^{2\pi} \frac{-\sin\theta + i\cos\theta}{\cos\theta + i\sin\theta}\,d\theta
 = \int_0^{2\pi} (-\sin\theta + i\cos\theta)(\cos\theta - i\sin\theta)\,d\theta
 = \int_0^{2\pi} i\,d\theta = 2\pi i$$

Even though $f(z)$ is analytic on $\Gamma$, $\oint_\Gamma f(z)\,dz \ne 0$ in this case. We shall see that the
singularity of $f(z)$ at the interior point $z = 0$ accounts for this fact. Notice that the
value of the integral is not dependent on the radius of $\Gamma$. Indeed, it will be shown
that for any closed rectifiable path $\Gamma$, with the origin in its interior, one has

$$\oint_\Gamma \frac{dz}{z} = 2\pi i$$
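The independence of the radius noted in Example 4.15 can be checked numerically. The sketch below assumes an equally spaced subdivision of the circle; the function name `circle_integral` and the sample radii and center are illustrative assumptions. It also evaluates two cases of the integrand $(z - z_0)^m$ with $m \ne -1$.

```python
import cmath
import math


def circle_integral(f, center, radius, n=4000):
    """Approximate the counterclockwise integral of f(z) dz around the circle
    z = center + radius (cos theta + i sin theta), 0 <= theta <= 2 pi."""
    dtheta = 2 * math.pi / n
    total = 0j
    for k in range(n):
        e = cmath.exp(1j * (k + 0.5) * dtheta)
        dz = 1j * radius * e * dtheta  # dz = radius (-sin + i cos) dtheta
        total += f(center + radius * e) * dz
    return total


reciprocal = [circle_integral(lambda z: 1 / z, 0, a) for a in (0.5, 1.0, 7.0)]

z0 = 2 - 1j
square = circle_integral(lambda z: (z - z0) ** 2, z0, 1.5)
inverse_square = circle_integral(lambda z: (z - z0) ** -2, z0, 1.5)
print(reciprocal, square, inverse_square)
```

Each entry of `reciprocal` should be close to $2\pi i$, while the other two integrals should be close to zero.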
Problems 
1. Show that i tice (ae wey 
2. Show that

$$\oint_\Gamma (z - z_0)^m\,dz = \begin{cases} 0 & \text{if } m \ne -1 \\ 2\pi i & \text{if } m = -1 \end{cases}$$

$m$ a positive or negative integer, if the integration is performed around the circle $\Gamma$
with radius $a$ and center $z_0$. The integration is performed in the counterclockwise
sense.

3. Define $\int_\Gamma |f(z)|\,|dz|$, and show that

$$\left| \int_\Gamma f(z)\,dz \right| \le \int_\Gamma |f(z)|\,|dz|$$

4. Show that $\int_\alpha^\beta f(z)\,dz = -\int_\beta^\alpha f(z)\,dz$, both integrals being taken along $\Gamma$.

5. If $|f(z)| \le M$ along $\Gamma$ and $L$ is the length of $\Gamma$, show that

$$\left| \int_\Gamma f(z)\,dz \right| \le LM$$

6. Why is it that $\int_\Gamma c f(z)\,dz = c \int_\Gamma f(z)\,dz$ for $c$ constant?

7. Why is it that $\int_\Gamma f_1(z)\,dz + \int_\Gamma f_2(z)\,dz = \int_\Gamma [f_1(z) + f_2(z)]\,dz$?

8. Evaluate $\int_\Gamma (z + |z|)\,dz$, where $\Gamma$ is the straight line from (0, 0) to (1, 1). Do
the same for the other path discussed in Example 4.14.

9. Evaluate $\int_1^{z_0} dz/z$ along the path $\Gamma$ consisting of $\Gamma_1$ and $\Gamma_2$, where $\Gamma_1$ is the straight
line from (1, 0) to $(|z_0|, 0)$ and $\Gamma_2$ is the arc of the circle from $(|z_0|, 0)$ to $z_0$ with center
at the origin. Is the value of the integral single-valued?


4.8. Cauchy's Integral Theorem. The fundamental theorem of
complex-variable theory is due to Cauchy. There are various forms of this
theorem. We present now a proof of one form.

Let $S$ be a simply connected open region such that the partial
derivatives of $u(x, y)$ and $v(x, y)$ are continuous and satisfy the
Cauchy-Riemann equations,

$$f(z) = u(x, y) + iv(x, y)$$

at every point of $S$. In Sec. 4.6 we saw that these conditions were sufficient
for $f(z)$ to be analytic in $S$. We shall show that, if $\Gamma$ is any closed simple
curve inside $S$, then

$$\oint_\Gamma f(z)\,dz = 0 \tag{4.12}$$

The statement concerning (4.12) is Cauchy's integral theorem. We
prove first that $\oint_C f(z)\,dz = 0$ for the rectangle $C$ (see Fig. 4.3) contained
entirely in $S$.

[Fig. 4.3. Rectangle with vertices $A(x_0, y_0)$, $B(x, y_0)$, $C(x, y)$, $D(x_0, y)$.]

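We can obtain a single-valued complex function of the two real
variables $x$ and $y$ by defining

$$F(x, y) = \int_{x_0}^{x} f(t + iy_0)\,dt + i \int_{y_0}^{y} f(x + it)\,dt$$

$F(x, y)$ is the sum of the integrals of $f(z)$ along the straight lines $AB$ and
$BC$. Similarly $G(x, y)$ is obtained by integrating $f(z)$ along $AD$ and $DC$.
The integration of $f(z)$ around the rectangle $C$ in the counterclockwise
sense is

$$\oint_C f(z)\,dz = F(x, y) - G(x, y)$$

where

$$G(x, y) = i \int_{y_0}^{y} f(x_0 + it)\,dt + \int_{x_0}^{x} f(t + iy)\,dt$$

If we can show that $F(x, y) = G(x, y)$, then $\oint_C f(z)\,dz = 0$.
Let the reader show that

$$\frac{\partial F}{\partial x} = f(x + iy_0) + i \int_{y_0}^{y} \frac{\partial f(x + it)}{\partial x}\,dt \tag{4.13}$$

We need

$$\frac{d}{dx} \int_{x_0}^{x} f(t + iy_0)\,dt = f(x + iy_0)$$

$$\frac{\partial}{\partial x} \int_{y_0}^{y} f(x + it)\,dt = \int_{y_0}^{y} \frac{\partial f(x + it)}{\partial x}\,dt$$

in order to obtain (4.13). These statements are proved in Chap. 10 for
$f$ real. The student need only write $f = u + iv$ and apply the theorems
of real-variable theory to $u$ and $v$ separately. The continuity of $\partial f/\partial x$
is used to perform the differentiation underneath the integral. Also
we have

$$\frac{\partial F}{\partial y} = i f(x + iy) = i f(z)$$

$$\frac{\partial G}{\partial x} = f(x + iy) = f(z) \tag{4.14}$$

$$\frac{\partial G}{\partial y} = i f(x_0 + iy) + \int_{x_0}^{x} \frac{\partial f(t + iy)}{\partial y}\,dt$$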




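As yet, we have not made use of the Cauchy-Riemann equations. Thus

$$f'(z) = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} - i \frac{\partial u}{\partial y}
 = -i \left( \frac{\partial u}{\partial y} + i \frac{\partial v}{\partial y} \right) = -i \frac{\partial f}{\partial y}$$

Since also $f'(z) = \partial f/\partial x$, we have $i\,\partial f(x + it)/\partial x = \partial f(x + it)/\partial t$. Hence

$$\frac{\partial F}{\partial x} = f(x + iy_0) + \int_{y_0}^{y} \frac{\partial f(x + it)}{\partial t}\,dt = f(x + iy) = f(z) \tag{4.15}$$

Similarly $\partial F/\partial y = \partial G/\partial y$, and hence $dF = dG$, $F = G + c$. Since $F = G = 0$
at $(x_0, y_0)$, the constant $c$ is zero. This proves that $\oint_C f(z)\,dz = 0$.

We now consider $\oint_{\Gamma_0} f(z)\,dz$, where $\Gamma_0$ is any closed rectifiable Jordan
curve lying inside the rectangle $ABCD$. Parts of the curve may be
segments of the sides of the rectangle (see Fig. 4.4). Let $P(x, y)$ be any
point on $\Gamma_0$, and define $F(x, y)$ at $P$ in exactly the same manner in which
$F$ was defined at $C(x, y)$. From (4.14) and (4.15)

$$dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy = f(z)\,dx + i f(z)\,dy = f(z)(dx + i\,dy) = f(z)\,dz$$

Hence

$$\oint_{\Gamma_0} f(z)\,dz = \oint_{\Gamma_0} dF = \oint_{\Gamma_0} dU + i \oint_{\Gamma_0} dV = 0$$

where $F = U + iV$. Certainly $\oint dU = \oint dV = 0$.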
[Fig. 4.4. Closed curve $\Gamma_0$ inside the rectangle with corner $A(x_0, y_0)$.]

The final part of the proof uses the following reasoning: Let $\Gamma$ be any
closed simple curve in $S$. $\Gamma$ is a closed set of points. Now $S$ was assumed
to be an open set. If we adjoin to $S$ its boundary points, we obtain $\bar{S}$, a
closed set, consisting of $S$ and the boundary of $S$, say, $\Gamma'$. There will be
a minimum distance between $\Gamma$ and $\Gamma'$ not equal to zero. We can therefore
impose a fine enough mesh on $S$ (the mesh consists of rectangles) so
that the rectangles which contain the points of $\Gamma$ will be entirely in $S$.
The integration over all rectangles interior to $\Gamma$ plus the integration
of $f(z)$ over those boundaries which include $\Gamma$ vanish from the above
results. However, all integrations over the rectangles interior to $\Gamma$
vanish in pairs, leaving

$$\oint_\Gamma f(z)\,dz = 0 \tag{4.16}$$
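A numerical experiment can make the rectangle case plausible. The sketch below is an illustration only: the rectangle, the test functions, and the helper name `integrate_polyline` are assumptions. It integrates the analytic function $e^z$ and the nonanalytic function $x - iy$ counterclockwise around the same rectangle.

```python
import cmath


def integrate_polyline(f, vertices, n=2000):
    """Midpoint-rule approximation of the integral of f(z) dz along the
    polygonal path through the given vertices."""
    total = 0j
    for a, b in zip(vertices, vertices[1:]):
        a, b = complex(a), complex(b)
        step = (b - a) / n
        for k in range(n):
            total += f(a + (k + 0.5) * step) * step
    return total


# Rectangle with A = 0, B = 2, C = 2 + i, D = i, traversed A -> B -> C -> D -> A.
rectangle = [0, 2, 2 + 1j, 1j, 0]
analytic = integrate_polyline(cmath.exp, rectangle)
not_analytic = integrate_polyline(lambda z: z.conjugate(), rectangle)
print(analytic, not_analytic)
```

The first value should be zero up to discretization error. The second should be close to $4i$, which is $2i$ times the area of the rectangle, so the theorem fails once the Cauchy-Riemann equations are dropped.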


The condition that the partial derivatives of $u$ and $v$ be continuous
can actually be omitted. The proof that $\oint_C f(z)\,dz = 0$ for the rectangle
$C$ under the condition that $f(z)$ be analytic in $S$ can be shown as follows:
First assume $\oint_C f(z)\,dz \ne 0$. Subdivide the region $R$ ($C$ and its interior)
into four new regions (see Fig. 4.5) by halving the sides of the rectangles.
Now $\oint_C f(z)\,dz$ equals the sum of the integrals of $f(z)$ over the boundaries
of the four new regions since the internal integrations cancel each other in
pairs. Hence at least one of the four integrals does not vanish. We
choose that boundary $C_1$ for which $\left| \oint_{C_1} f(z)\,dz \right|$ has the largest value.
Again we subdivide, choose $C_2$, and continue this process indefinitely. We
obtain a sequence of regions $R, R_1, R_2, \ldots, R_n, \ldots$, with boundaries
$C, C_1, C_2, C_3, \ldots, C_n, \ldots$ such that $\oint_{C_n} f(z)\,dz \ne 0$.

[Fig. 4.5. The rectangle $C$ subdivided into four rectangles by halving its sides.]

The regions $R_n$, $n = 1, 2, \ldots$ are closed and bounded sets,
and the diameter of $R_n$ tends to zero as $n \to \infty$. From the theorem of nested
sets (see Chap. 10) there exists a unique point $P$ which belongs to every $R_n$,
$n = 1, 2, 3, \ldots$. Let $P$ be the point $z_0$. Obviously $z_0$ is interior to $R$,
or $z_0$ lies on $C$. Hence $f(z)$ is differentiable at $z = z_0$ so that

$$f(z) = f(z_0) + f'(z_0)(z - z_0) + \epsilon(z, z_0)(z - z_0)$$

where $\epsilon(z, z_0)$ tends to zero as $z$ tends to $z_0$. Hence, given any $\epsilon_0 > 0$, we
can pick a region $R_n$ with boundary $C_n$ such that $|\epsilon(z, z_0)| < \epsilon_0$ for all $z$
on $C_n$. Now

$$\left| \oint_C f(z)\,dz \right| \le 4 \left| \oint_{C_1} f(z)\,dz \right| \le 4^2 \left| \oint_{C_2} f(z)\,dz \right| \le \cdots \le 4^n \left| \oint_{C_n} f(z)\,dz \right|$$

But

$$\oint_{C_n} f(z)\,dz = \oint_{C_n} f(z_0)\,dz + \oint_{C_n} f'(z_0)(z - z_0)\,dz + \oint_{C_n} \epsilon(z, z_0)(z - z_0)\,dz$$

so that

$$\oint_{C_n} f(z)\,dz = \oint_{C_n} \epsilon(z, z_0)(z - z_0)\,dz$$

Remember that $\oint dz = 0$, $\oint z\,dz = 0$. For any positive $\epsilon_0$ we choose $n$
sufficiently large so that $|\epsilon(z, z_0)| < \epsilon_0$ for all $z$ on $C_n$. If $l_n$ is the length
of the diagonal of the rectangle $C_n$, then

$$\left| \oint_{C_n} f(z)\,dz \right| < \epsilon_0 \oint_{C_n} |z - z_0|\,|dz| \le \epsilon_0 \cdot 4 l_n^2 = 4 \epsilon_0 \left( \frac{L}{2^n} \right)^2$$

since $l_n = L/2^n$, where $L$ is the length of the diagonal of $C$. Hence

$$\left| \oint_C f(z)\,dz \right| < 4^n \cdot \frac{4 \epsilon_0 L^2}{4^n} = 4 \epsilon_0 L^2$$

Since $\epsilon_0$ can be chosen arbitrarily small, the constant $\oint_C f(z)\,dz$ must be
zero. Q.E.D. The proof of Cauchy's theorem follows then in exactly
the same manner as demonstrated above. This proof is due to Bliss in
the American Mathematical Society Colloquium Publications, vol. 16.

A very strong statement of Cauchy's theorem is as follows: Let $\Gamma$
be a simple closed path such that $f(z)$ is analytic in the interior $R$ of $\Gamma$
and such that $f(z)$ is continuous on $\Gamma$. Then

$$\oint_\Gamma f(z)\,dz = 0$$

The proof is not trivial and is omitted here. Continuity of $f(z)$ on $\Gamma$ in
this case means $\lim_{z \to \zeta} f(z) = f(\zeta)$, $\zeta$ on $\Gamma$, $z$ on $\Gamma$ or in $R$.


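We now state some immediate consequences of Cauchy's integral
theorem:

A. Let $f(z)$ be analytic in a simply connected region $R$. For $z = \alpha$,
$z = \beta$, in $R$,

$$\int_\alpha^\beta f(z)\,dz$$

is independent of the simple path chosen from $z = \alpha$ to $z = \beta$ provided
the path lies entirely in $R$. Let the reader verify this statement.

B. Let $f(z)$ be analytic in a region $R$ bounded by two simple closed
paths $C_1$, $C_2$ (see Fig. 4.6). Then $\oint_{C_1} f(z)\,dz = \oint_{C_2} f(z)\,dz$ provided
both integrations are done in a clockwise sense or a counterclockwise
sense.

[Fig. 4.6. Region $R$ bounded by the closed paths $C_1$ and $C_2$. Fig. 4.7. The same region with the paths $AB$ and $EF$ added.]

The proof is as follows: Construct the paths $AB$ and $EF$ (see
Fig. 4.7). From Cauchy's theorem the integral of $f(z)$ vanishes around
each of the two closed paths into which $AB$ and $EF$ divide the boundary
of $R$. Adding the two integrals, the integrations along $AB$ and $EF$ cancel
in pairs, which yields

$$\oint_{C_1} f(z)\,dz + \oint_{C_2} f(z)\,dz = 0$$

with $C_1$ traversed in the counterclockwise sense and $C_2$ in the clockwise sense, and hence

$$\oint_{C_1} f(z)\,dz = \oint_{C_2} f(z)\,dz$$

with both paths traversed in the counterclockwise sense.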



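Notice that nothing need be known about $f(z)$ outside of $C_1$ or inside of
$C_2$. We have assumed $f(z)$ continuous on $C_1$ and $C_2$.

C. For Fig. 4.8 let the reader show that

$$\oint_{\Gamma_0} f(z)\,dz = \oint_{\Gamma_1} f(z)\,dz + \oint_{\Gamma_2} f(z)\,dz$$

$f(z)$ is analytic inside $\Gamma_0$ and outside $\Gamma_1$ and $\Gamma_2$ and is continuous on
$\Gamma_0$, $\Gamma_1$, $\Gamma_2$. Generalize this statement for the curves $\Gamma_0, \Gamma_1, \Gamma_2, \ldots, \Gamma_n$.

[Fig. 4.8. Closed path $\Gamma_0$ enclosing the closed paths $\Gamma_1$ and $\Gamma_2$.]

D. Let $f(z)$ be analytic inside a simply connected open region $R$,
and consider

$$F(z) = \int_{z_0}^{z} f(t)\,dt$$

where $z_0$ and $z$ are in $R$. The path of integration is omitted since the
integral is independent of the path (see A). Hence

$$F(z + \Delta z) - F(z) = \int_{z_0}^{z + \Delta z} f(t)\,dt + \int_{z}^{z_0} f(t)\,dt = \int_{z}^{z + \Delta z} f(t)\,dt$$

We choose the straight-line path from $z$ to $z + \Delta z$ as the path of integration.
Then $t = z + \mu \Delta z$, $0 \le \mu \le 1$, along this path, $dt = d\mu\,\Delta z$, and

$$\frac{F(z + \Delta z) - F(z)}{\Delta z} = \int_0^1 f(z + \mu \Delta z)\,d\mu$$

Since $f(z)$ is analytic at $z$,

$$\lim_{\Delta z \to 0} \frac{f(z + \mu \Delta z) - f(z)}{\mu \Delta z} = f'(z)$$

and

$$f(z + \mu \Delta z) = f(z) + \mu \Delta z f'(z) + \epsilon \mu \Delta z \tag{4.16'}$$

where $|\epsilon| \to 0$ as $\Delta z \to 0$. Hence

$$\frac{F(z + \Delta z) - F(z)}{\Delta z} = f(z) \int_0^1 d\mu + \Delta z f'(z) \int_0^1 \mu\,d\mu + \Delta z \int_0^1 \epsilon \mu\,d\mu$$

It is obvious that

$$\lim_{\Delta z \to 0} \frac{F(z + \Delta z) - F(z)}{\Delta z} = f(z)$$

so that $F(z)$ is analytic in $R$.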
E. Let $f(z)$ be continuous inside a simply connected open region $R$,
and assume

$$F(z) = \int_{z_0}^{z} f(t)\,dt$$

is independent of the path from $z_0$ to all $z$, the path lying entirely in $R$.
We show that $F(z)$ is analytic in $R$. The variable $t$ is simply the complex
variable of integration. The reader can prove this statement easily
enough by proceeding as in D. Continuity implies that

$$f(z + \mu \Delta z) = f(z) + \eta \tag{4.16''}$$

where $\eta \to 0$ as $\Delta z \to 0$. In the proof of D it was not necessary to use
the analyticity of $f(z)$ twice. We could have used (4.16'') in place of
(4.16').

F. The fundamental theorem of the integral calculus applies equally
well to the theory of complex variables. Assuming the conditions stated
in D yields

$$F(z) = \int_{z_0}^{z} f(t)\,dt \qquad F'(z) = f(z)$$

Let $G(z)$ be any function such that $G'(z) = f(z)$. Then

$$\frac{d}{dz} [F(z) - G(z)] = 0$$

and $F(z) = G(z) - C$. Hence

$$G(z) - C = \int_{z_0}^{z} f(t)\,dt \qquad G(z_0) - C = \int_{z_0}^{z_0} f(t)\,dt = 0$$

so that

$$G(z) - G(z_0) = \int_{z_0}^{z} f(t)\,dt$$

As a simple example, $\int_\alpha^\beta z^2\,dz = \tfrac{1}{3}(\beta^3 - \alpha^3)$ since $\frac{d}{dz}\left(\tfrac{1}{3} z^3\right) = z^2$.
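Consequences A and F can be illustrated together for $f(z) = z^2$. In the sketch below the end points, the two polygonal paths, and the helper name `integrate_polyline` are hypothetical choices; both numerical integrals are compared with $G(\beta) - G(\alpha)$ for $G(z) = z^3/3$.

```python
def integrate_polyline(f, vertices, n=20000):
    """Midpoint-rule approximation of the integral of f(z) dz along the
    polygonal path through the given vertices."""
    total = 0j
    for a, b in zip(vertices, vertices[1:]):
        a, b = complex(a), complex(b)
        step = (b - a) / n
        for k in range(n):
            total += f(a + (k + 0.5) * step) * step
    return total


def square(z):
    return z * z


alpha, beta = 1 - 1j, -2 + 2j
antiderivative = (beta ** 3 - alpha ** 3) / 3       # G(beta) - G(alpha)
path_one = integrate_polyline(square, [alpha, beta])
path_two = integrate_polyline(square, [alpha, 3 + 3j, -4j, beta])
print(path_one, path_two, antiderivative)
```

All three numbers should agree to within the discretization error of the midpoint rule.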


Problems

1. Show that $\oint_\Gamma \frac{dz}{z - z_0} = 2\pi i$ for any simple closed path $\Gamma$, $z_0$ in the interior of $\Gamma$.
Show that $\oint_\Gamma \frac{dz}{z - z_0} = 0$ if $z_0$ is exterior to $\Gamma$.

2. Evaluate $\oint_\Gamma \frac{dz}{z^2 - z}$ for any simple closed curve $\Gamma$ enclosing the circle $r = 1$. Hint:
Use C above, and write $1/(z^2 - z) = 1/(z - 1) - 1/z$.

3. Construct a function $f(z)$ such that $\oint f(z)\,dz = 0$ for all simple closed paths, $f(z)$
not analytic. Does this contradict Cauchy's theorem?

4. Let $f(z, t)$ be a complex function of the complex variable $z$ and the real variable $t$.
Assume $f(z, t)$ and $\partial f(z, t)/\partial t$ analytic in $z$ for $t_0 \le t \le t_1$, and consider

$$F(t) = \int_\alpha^\beta f(z, t)\,dz$$

$$G(t) = \int_\alpha^\beta \frac{\partial f(z, t)}{\partial t}\,dz$$

If, furthermore, $\partial f(z, t)/\partial t$ is continuous in $z$ and $t$, show that

$$\int_{t_0}^{t} G(u)\,du = F(t) - F(t_0)$$

so that

$$\frac{d}{dt} \int_\alpha^\beta f(z, t)\,dz = \int_\alpha^\beta \frac{\partial f(z, t)}{\partial t}\,dz$$

5. Let $f(z)$ and $g(z)$ be analytic in a simply connected region $R$. From

$$\frac{d}{dz} (f(z) g(z)) = f(z) g'(z) + f'(z) g(z)$$

show that

$$\int_\alpha^\beta f(z) g'(z)\,dz = f(\beta) g(\beta) - f(\alpha) g(\alpha) - \int_\alpha^\beta g(z) f'(z)\,dz$$

The path of integration from $z = \alpha$ to $z = \beta$ lies in $R$.


4.9. Cauchy's Integral Formula. A truly fundamental consequence of
Cauchy's integral theorem of Sec. 4.8 is the following formula due to
Cauchy: Let $f(z)$ be analytic in the simply connected open region $R$, and
let $\Gamma$ be a simple closed curve in $R$. Then

$$f(a) = \frac{1}{2\pi i} \oint_\Gamma \frac{f(z)\,dz}{z - a} \tag{4.17}$$

if $z = a$ is an interior point of $\Gamma$. The sense of integration around $\Gamma$
is such that as we move around $\Gamma$ the region containing $z = a$ lies to our
left. The proof is as follows: From Sec. 4.8B we can replace the curve of
integration $\Gamma$ by any circle $\Gamma_0$ with center at $z = a$, $\Gamma_0$ interior to $\Gamma$.
Then

$$\frac{1}{2\pi i} \oint_\Gamma \frac{f(z)\,dz}{z - a} = \frac{1}{2\pi i} \oint_{\Gamma_0} \frac{f(z)\,dz}{z - a}
 = \frac{1}{2\pi i} \int_0^{2\pi} \frac{f(z)\,b(-\sin\theta + i\cos\theta)}{b(\cos\theta + i\sin\theta)}\,d\theta
 = \frac{1}{2\pi} \int_0^{2\pi} f[a + b(\cos\theta + i\sin\theta)]\,d\theta \tag{4.17'}$$

since $z = a + b(\cos\theta + i\sin\theta)$, $0 \le \theta \le 2\pi$, is the equation of $\Gamma_0$, $b$
the radius of $\Gamma_0$. Since $f(z)$ is continuous at $z = a$, we have

$$f[a + b(\cos\theta + i\sin\theta)] = f(a) + \eta$$

where $\eta \to 0$ as $b \to 0$. Hence

$$\int_0^{2\pi} f[a + b(\cos\theta + i\sin\theta)]\,d\theta = 2\pi f(a) + \int_0^{2\pi} \eta\,d\theta$$

and

$$\lim_{b \to 0} \frac{1}{2\pi} \int_0^{2\pi} f[a + b(\cos\theta + i\sin\theta)]\,d\theta = f(a)$$




Since the left-hand side of (4.17') is independent of $b$, (4.17) must result.
We can now observe one important consequence of analyticity. To
evaluate the right-hand side of (4.17), one needs only to know the value
of $f(z)$ on the boundary $\Gamma$. After the integration is performed, the value
of $f(z)$ is known in the interior of $\Gamma$. Thus, if $f(z)$ is known to be analytic
inside $\Gamma$, if $f(z)$ is continuous on $\Gamma$ so that Cauchy's theorem holds, then
the values of $f(z)$ interior to $\Gamma$ can be determined if we know only the
values of $f(z)$ on $\Gamma$. Analyticity is, indeed, a powerful condition.
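Formula (4.17) lends itself to a direct numerical check: sample $f(z)$ only on a circle and recover an interior value. The sketch below assumes $f(z) = e^z$, a circular path, and an equally spaced subdivision; the function name `cauchy_value` is hypothetical.

```python
import cmath
import math


def cauchy_value(f, a, center, radius, n=2000):
    """Approximate (1 / (2 pi i)) times the counterclockwise integral of
    f(z) / (z - a) dz around the circle |z - center| = radius."""
    dtheta = 2 * math.pi / n
    total = 0j
    for k in range(n):
        e = cmath.exp(1j * k * dtheta)
        z = center + radius * e
        total += f(z) / (z - a) * (1j * radius * e) * dtheta
    return total / (2j * math.pi)


a = 0.3 - 0.2j                      # an interior point of both circles
from_unit_circle = cauchy_value(cmath.exp, a, center=0, radius=1.0)
from_larger_circle = cauchy_value(cmath.exp, a, center=0, radius=2.0)
print(from_unit_circle, from_larger_circle, cmath.exp(a))
```

Both boundary integrals should reproduce $e^a$, illustrating that the result does not depend on which admissible curve surrounds $z = a$.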

Since $f'(a)$ is known to exist, we may hope that $f'(a)$ can be obtained
from (4.17) by differentiating underneath the integral. If this were
possible, we would obtain

$$f'(a) = \frac{1}{2\pi i} \oint_\Gamma \frac{f(z)\,dz}{(z - a)^2} \tag{4.18}$$

Let us prove that (4.18) is correct. We have

$$f(a + h) = \frac{1}{2\pi i} \oint_\Gamma \frac{f(z)\,dz}{z - a - h}$$

$$f(a + h) - f(a) = \frac{1}{2\pi i} \oint_\Gamma \left[ \frac{1}{z - a - h} - \frac{1}{z - a} \right] f(z)\,dz$$

$$\frac{f(a + h) - f(a)}{h} = \frac{1}{2\pi i} \oint_\Gamma \frac{f(z)\,dz}{(z - a)(z - a - h)}$$

Since $f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$, we need only show that

$$\lim_{h \to 0} \oint_\Gamma \frac{f(z)\,dz}{(z - a)(z - a - h)} = \oint_\Gamma \frac{f(z)\,dz}{(z - a)^2} \tag{4.19}$$

to obtain (4.18). Consider

$$\oint_\Gamma \frac{f(z)\,dz}{(z - a)(z - a - h)} - \oint_\Gamma \frac{f(z)\,dz}{(z - a)^2} = h \oint_\Gamma \frac{f(z)\,dz}{(z - a)^2 (z - a - h)}$$

For $h$ sufficiently small it is easy to see that $\left| \frac{f(z)}{(z - a)^2 (z - a - h)} \right|$ is
uniformly bounded for all $z$ on $\Gamma$, so that as $h \to 0$ (4.19) holds, which in
turn yields (4.18). By mathematical induction let the reader show that

$$f^{(n)}(a) = \frac{n!}{2\pi i} \oint_\Gamma \frac{f(z)\,dz}{(z - a)^{n+1}} \qquad n = 0, 1, 2, 3, \ldots \tag{4.20}$$

Equation (4.20) embraces Cauchy's integral formula for $f(a)$ and all its
derivatives. It is important to note that analyticity of $f(z)$ is strong
enough to guarantee the existence of all derivatives of $f(z)$. In real-variable
theory the existence of $f'(x)$ in a neighborhood of $x = a$ in no
way yields any information about the existence of further derivatives of
$f(x)$ at $x = a$.
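Formula (4.20) can be tried out on a function whose derivatives are known in closed form. The following sketch assumes $f(z) = \sin z$ and a circle of radius 1 about $a$; the function name `cauchy_derivative` and the sample point are illustrative assumptions.

```python
import cmath
import math


def cauchy_derivative(f, a, order, radius=1.0, n=2000):
    """Approximate (order! / (2 pi i)) times the counterclockwise integral of
    f(z) / (z - a)^(order + 1) dz around the circle |z - a| = radius."""
    dtheta = 2 * math.pi / n
    total = 0j
    for k in range(n):
        e = cmath.exp(1j * k * dtheta)
        z = a + radius * e
        total += f(z) / (z - a) ** (order + 1) * (1j * radius * e) * dtheta
    return math.factorial(order) * total / (2j * math.pi)


a = 0.5j
value = cauchy_derivative(cmath.sin, a, 0)   # order 0 is formula (4.17)
third = cauchy_derivative(cmath.sin, a, 3)   # third derivative of sin is -cos
print(value, third)
```

The two results should be close to $\sin a$ and $-\cos a$ respectively.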




Problems

1. Let $f(z)$ be analytic within a circle $C$ of radius $R$ and center $a$, $f(z)$ continuous on
$C$. Let $M$ be an upper bound of $|f(z)|$ on $C$. Show that

$$|f^{(n)}(a)| \le \frac{M n!}{R^n}$$

2. If $f(z)$ is analytic for all $z$ it is said to be an entire function. Use the results of
Prob. 1 to show that an entire bounded function must be a constant. $f(z)$ is said to be
bounded if a constant $M$ exists such that $|f(z)| < M$ for all $z$. Hence prove the first
theorem of Liouville that a nonconstant entire function assumes arbitrarily large
values (in absolute value) outside every circle with center at $z = 0$.

3. Consider the polynomial

$$f(z) = a_0 z^n + a_1 z^{n-1} + a_2 z^{n-2} + \cdots + a_n$$

Show that

$$|f(z)| \ge |z|^n \left( |a_0| - \frac{|a_1|}{|z|} - \frac{|a_2|}{|z|^2} - \cdots - \frac{|a_n|}{|z|^n} \right)$$

Hence show that for any positive $M$ there exists an $R$ such that, for $|z| > R$, $|f(z)| > M$.

4. Let $f(z)$ be the polynomial of Prob. 3. Assume $f(z) \ne 0$ for all $z$, and consider
$g(z) = 1/f(z)$. Why is $g(z)$ an entire function? Use the result of Prob. 3 to show
that $g(z)$ is a bounded entire function. Since $g(z) \ne$ constant, use the result of Prob.
2 to deduce that a $z_0$ exists such that $f(z_0) = 0$. This is the fundamental theorem of
algebra (Gauss). Deduce that $f(z)$ has $n$ zeros.

5. Prove Morera's theorem: If $f(z)$ is continuous in a simply connected region $R$,
and if $\oint f(z)\,dz = 0$ for every simple closed path in $R$, then $f(z)$ is analytic in $R$.

6. Prove (4.20) for $n = 2$.

7. Let $\varphi_n(z)$, $n = 1, 2, 3, \ldots$, be a sequence of functions analytic inside and on
the simple closed curve $\Gamma$ which converges uniformly to $\varphi(z)$. Show that $\varphi(z)$ is
analytic inside $\Gamma$.


4.10. Taylor's Expansion. In real-variable theory certain functions
could be written as infinite series, called Taylor's series or expansion.
For example,

$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n + 1)!}$$

$$\ln (1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots = \sum_{n=1}^{\infty} \frac{(-1)^{n-1} x^n}{n}$$

The function $f(x) = e^{-1/x^2}$, $x \ne 0$, $f(0) = 0$, is such that

$$f(0) = f'(0) = f''(0) = \cdots = f^{(n)}(0) = \cdots = 0$$




However, this function has no Taylor-series expansion about $x = 0$.
We shall find, however, that if $f(z)$ is analytic at $z = a$, a Taylor-series
expansion will exist for $f(z)$ at $z = a$. Remember that analyticity of $f(z)$
at $z = a$ means differentiability of $f(z)$ for all points in some neighborhood
of $z = a$. Before proving this result the student should review the
definitions and theorems involving infinite series (see Chap. 10). The
definitions of convergence and uniform convergence of series, and the
theorems regarding term-by-term integration and differentiation of a
series, apply equally well for an infinite series whose terms are complex.

We now proceed to the development of Taylor's theorem. Let $f(z)$ be
a function analytic in an open region $R$, and let $z_0$ be an interior point
of $R$. There exists a unique power series $\sum_{n=0}^{\infty} a_n (z - z_0)^n$ such that

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n \qquad a_n = \frac{1}{n!} f^{(n)}(z_0) \tag{4.21}$$

for all $z$ in some neighborhood of $z = z_0$. The series (4.21) for $f(z)$ converges
to $f(z)$ for all $z$ inside a circle $C$ surrounding $z = z_0$, and the circumference
of $C$ contains those singularities of $f(z)$ which are closest to
$z = z_0$. A point $z_1$ is said to be a singular point of $f(z)$ if $f(z)$ is not
analytic at $z = z_1$.

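We proceed to the proof. Let $C$ be the circle mentioned above. $C$ has
the property that there is at least one point on $C$ at which $f(z)$ is not
analytic. Moreover $f(z)$ is analytic at every interior point of $C$. Now
let $z$ be any interior point of $C$, and construct a circle $\Gamma$ with center $z_0$
containing $z$ in its interior, $\Gamma$ interior to $C$ (see Fig. 4.9).

[Fig. 4.9. Circle $C$ about $z_0$ and the interior circle $\Gamma$ containing $z$.]

Let $\zeta$ be any point on $\Gamma$. From Cauchy's integral formula

$$f(z) = \frac{1}{2\pi i} \oint_\Gamma \frac{f(\zeta)\,d\zeta}{\zeta - z}
 = \frac{1}{2\pi i} \oint_\Gamma \frac{f(\zeta)\,d\zeta}{(\zeta - z_0) - (z - z_0)}
 = \frac{1}{2\pi i} \oint_\Gamma \frac{f(\zeta)\,d\zeta}{(\zeta - z_0)[1 - (z - z_0)/(\zeta - z_0)]} \tag{4.22}$$

Since $\left| \frac{z - z_0}{\zeta - z_0} \right| \le K < 1$ (see Fig. 4.9), we have

$$\frac{1}{1 - (z - z_0)/(\zeta - z_0)} = \sum_{n=0}^{\infty} \left( \frac{z - z_0}{\zeta - z_0} \right)^n$$

Equation (4.22) becomes

$$f(z) = \frac{1}{2\pi i} \oint_\Gamma \sum_{n=0}^{\infty} \frac{(z - z_0)^n f(\zeta)}{(\zeta - z_0)^{n+1}}\,d\zeta$$

The uniform convergence of the series $\sum_{n=0}^{\infty} \frac{(z - z_0)^n}{(\zeta - z_0)^{n+1}}$, $\zeta$ the variable of
integration, $z_0$ and $z$ fixed, enables us to interchange integration and
summation. Hence

$$f(z) = \sum_{n=0}^{\infty} (z - z_0)^n\,\frac{1}{2\pi i} \oint_\Gamma \frac{f(\zeta)\,d\zeta}{(\zeta - z_0)^{n+1}} = \sum_{n=0}^{\infty} a_n (z - z_0)^n$$

where $a_n = \frac{1}{2\pi i} \oint_\Gamma \frac{f(\zeta)\,d\zeta}{(\zeta - z_0)^{n+1}} = \frac{f^{(n)}(z_0)}{n!}$. See (4.20). Q.E.D.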
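The coefficient formula obtained in the proof can be evaluated numerically and compared with $f^{(n)}(z_0)/n!$. The sketch below assumes $f(z) = e^z$, the center $z_0 = 1$, and a circle $\Gamma$ of radius 1; the function name `taylor_coefficient` is hypothetical.

```python
import cmath
import math


def taylor_coefficient(f, z0, n, radius=1.0, samples=1024):
    """Approximate a_n = (1 / (2 pi i)) times the counterclockwise integral of
    f(zeta) / (zeta - z0)^(n + 1) d zeta around |zeta - z0| = radius."""
    dtheta = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        e = cmath.exp(1j * k * dtheta)
        zeta = z0 + radius * e
        total += f(zeta) / (zeta - z0) ** (n + 1) * (1j * radius * e) * dtheta
    return total / (2j * math.pi)


z0 = 1.0
coeffs = [taylor_coefficient(cmath.exp, z0, n) for n in range(12)]

z = 1.4 + 0.3j                       # a point inside the circle Gamma
partial_sum = sum(c * (z - z0) ** n for n, c in enumerate(coeffs))
print(partial_sum, cmath.exp(z))
```

For this choice each $a_n$ should be close to $e/n!$, and twelve terms of the series already reproduce $e^z$ at the sample point to high accuracy.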


To show uniqueness, assume

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n = \sum_{n=0}^{\infty} b_n (z - z_0)^n$$

for all $z$ inside $C$. For $z = z_0$ we have $a_0 = b_0$, so that

$$\sum_{n=1}^{\infty} a_n (z - z_0)^n = \sum_{n=1}^{\infty} b_n (z - z_0)^n \tag{4.23}$$

for all $z$ inside $C$. But (4.23) implies

$$(z - z_0) \sum_{n=1}^{\infty} a_n (z - z_0)^{n-1} = (z - z_0) \sum_{n=1}^{\infty} b_n (z - z_0)^{n-1}$$

for all $z$ inside $C$, which in turn implies

$$\sum_{n=1}^{\infty} a_n (z - z_0)^{n-1} = \sum_{n=1}^{\infty} b_n (z - z_0)^{n-1}$$

for all $z$ inside $C$ with the possible exception of $z = z_0$. Hence

$$\lim_{z \to z_0} \sum_{n=1}^{\infty} a_n (z - z_0)^{n-1} = \lim_{z \to z_0} \sum_{n=1}^{\infty} b_n (z - z_0)^{n-1}$$

so that $a_1 = b_1$. By mathematical induction the reader can show that
$a_n = b_n$, $n = 0, 1, 2, \ldots$. Q.E.D.


Example 4.16. The function $f(z)$ defined by

$$f(z) = e^x \cos y + i e^x \sin y$$

is easily seen to be analytic everywhere. We have

$$f'(z) = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} = e^x \cos y + i e^x \sin y = f(z)$$

so that by mathematical induction $f^{(n)}(z) = f(z)$ and $f^{(n)}(0) = f(0) = 1$. Hence

$$f(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!} = e^x (\cos y + i \sin y)$$

We define $f(z)$ to be $e^z = e^{x+iy}$. We note that, if $z$ is real, $f(z)$ reduces to $e^x$. By direct
multiplication of the series representing $e^z$ and $e^{z_1}$ it can be shown that $e^z e^{z_1} = e^{z+z_1}$.
An easier way is the following: Let the reader show that $\frac{d}{dz} e^{a-z} = -e^{a-z}$ for any constant
$a$. Then

$$\frac{d}{dz} (e^z e^{a-z}) = -e^z e^{a-z} + e^z e^{a-z} = 0$$

so that $e^z e^{a-z}$ is independent of $z$. Let $z = 0$, and we have $e^z e^{a-z} = e^a$. Now let
$a = z + z_1$ so that $e^z e^{z_1} = e^{z+z_1}$. From this
result we have $e^{iy} = \cos y + i \sin y$, $y$ real.
Let the reader show that this result of
Euler's also holds for $y$ complex.
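The statements of Example 4.16 can be checked numerically by summing the series directly. The sketch below uses a hypothetical helper `exp_series` and arbitrary sample points; it compares the partial sums with $e^x(\cos y + i \sin y)$ and tests the addition theorem.

```python
import math


def exp_series(z, terms=60):
    """Partial sum of the series  sum z^n / n!  for n = 0, ..., terms - 1."""
    total = 0j
    term = 1 + 0j                    # z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)          # turns z^n / n! into z^(n+1) / (n+1)!
    return total


z = 1.2 - 2.5j
closed_form = math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))

z1 = -0.7 + 0.4j
product = exp_series(z) * exp_series(z1)
of_sum = exp_series(z + z1)
print(exp_series(z), closed_form, product, of_sum)
```

The series value should match the closed form, and the product should match the series evaluated at $z + z_1$.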

Example 4.17. Let us define Ln $z$ as follows. Let $z$ be any complex number other
than $z = 0$, $z = r(\cos\theta + i \sin\theta) = re^{i\theta}$ from Example 4.16, $-\pi < \theta \le \pi$. We
evaluate Ln $z = \int_1^z \frac{dt}{t}$ along any curve $\Gamma$ not passing through the origin or crossing
the negative $x$ axis (see Fig. 4.10). We can replace $\Gamma$ by the path from $z = 1$ to $z = r$
followed by the arc of the circle with radius $r$ and center at $z = 0$ until we reach $z$.

[Fig. 4.10. Path $\Gamma$ from $z = 1$ to $z$ that does not cross the negative $x$ axis.]

This yields

$$\text{Ln } z = \int_1^r \frac{dx}{x} + \int_0^\theta \frac{i r e^{i\varphi}\,d\varphi}{r e^{i\varphi}} = \ln r + i\theta = \ln |z| + i\,\text{Arg } z$$

Ln $z$ is single-valued and analytic everywhere except at $z = 0$, since $\frac{d\,\text{Ln } z}{dz} = \frac{1}{z}$. For $z$
real and positive, Ln $z$ becomes the ordinary $\log_e x$. It is easy to show that

$$\frac{d^2\,\text{Ln } z}{dz^2} = -\frac{1}{z^2}, \quad \ldots, \quad \frac{d^n\,\text{Ln } z}{dz^n} = \frac{(n - 1)!\,(-1)^{n-1}}{z^n}$$

so that

$$\text{Ln } z = (z - 1) - \frac{(z - 1)^2}{2} + \frac{(z - 1)^3}{3} - \cdots$$

Why does the series expansion for Ln $z$ converge only for $|z - 1| < 1$?
If we had not imposed the condition $-\pi < \theta \le \pi$, then Ln $z$ would not have been
single-valued. Let the reader show that for this case

$$\ln z = \int_1^z \frac{dt}{t} = \ln |z| + i(\text{Arg } z + 2\pi n)$$

where $n$ is the number of times the path of integration $\Gamma$ encircles the origin. If the
integration is performed in a clockwise fashion, $n$ is negative; otherwise $n$ is a positive
integer. Let the reader show that

$$\text{Ln } (-1) = \pi i$$
$$\ln z_1 z_2 = \ln z_1 + \ln z_2 + 2\pi i n$$
$$\text{Ln } x_1 x_2 = \text{Ln } x_1 + \text{Ln } x_2 \qquad x_1 > 0,\ x_2 > 0$$

Ln $z$ is called the principal value of $\ln z$. We imagine a cut exists along the negative
$x$ axis which forbids us from crossing the negative $x$ axis. We call the point $z = 0$ a
branch point of $\ln z$. If we wish to pass from Ln $z$ to $\ln z$, we need only imagine that
as we approach the cut from the top half of the $z$ plane we have the ability to slide
under the cut into a new $z$ plane. In this new Riemann surface, or sheet, we have

$$\ln^{(1)} z = \text{Ln } z + 2\pi i$$

If we now swing around the origin and slide into a new surface, we have

$$\ln^{(2)} z = \text{Ln } z + 4\pi i$$

This process can be extended indefinitely. On each Riemann surface

$$\ln^{(k)} z = \text{Ln } z + 2\pi k i$$

is a single-valued function.
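The dependence of $\int_1^z dt/t$ on the number of times the path encircles the origin can be observed numerically. In the sketch below the end point, the two polygonal paths, and the helper name `integrate_polyline` are assumptions; the standard-library function `cmath.log` returns the principal value and serves as the reference for Ln $z$.

```python
import cmath
import math


def integrate_polyline(f, vertices, n=20000):
    """Midpoint-rule approximation of the integral of f(t) dt along the
    polygonal path through the given vertices."""
    total = 0j
    for a, b in zip(vertices, vertices[1:]):
        a, b = complex(a), complex(b)
        step = (b - a) / n
        for k in range(n):
            total += f(a + (k + 0.5) * step) * step
    return total


def reciprocal(t):
    return 1 / t


z = -1 + 1j
# Straight segment from 1 to z: it does not cross the negative x axis.
principal = integrate_polyline(reciprocal, [1, z])
# Same end point, but the path first encircles the origin once counterclockwise.
wound_once = integrate_polyline(reciprocal, [1, 1j, -1, -1j, 1, z])
print(principal, wound_once, cmath.log(z))
```

The first integral should agree with Ln $z$, and the second should exceed it by $2\pi i$, corresponding to $n = 1$.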

Example 4.18. The function $f(z) = 1/(1 - z)$, $z \ne 1$, $f(1)$ defined in any way we
please, has for its only singularity the point $z = 1$. The Taylor-series expansion of
$f(z)$ about $z = 0$ must converge for $|z| < 1$. Indeed

$$f(z) = \sum_{n=0}^{\infty} z^n \tag{4.24}$$

converges for $|z| < 1$ and can be used to represent $1/(1 - z)$ for $|z| < 1$. Let the
reader show that

$$1 = \frac{1}{n!} \frac{d^n}{dz^n} \left( \frac{1}{1 - z} \right) \bigg|_{z=0} \qquad n = 0, 1, 2, \ldots$$

The series expansion of $1/(1 - z)$ about the point $z = i/2$ should converge for
$\left| z - \frac{i}{2} \right| < \frac{\sqrt{5}}{2}$ since the point $z = i/2$ is $\sqrt{5}/2$ units
from $z = 1$. We write

$$\frac{1}{1 - z} = \frac{1}{\left(1 - \frac{i}{2}\right) - \left(z - \frac{i}{2}\right)}
 = \frac{1}{1 - i/2} \cdot \frac{1}{1 - \dfrac{z - i/2}{1 - i/2}}
 = \frac{1}{1 - i/2} \sum_{n=0}^{\infty} \left( \frac{z - i/2}{1 - i/2} \right)^n \tag{4.25}$$

The circles of convergence, $|z| = 1$ and $\left| z - \frac{i}{2} \right| = \frac{\sqrt{5}}{2}$, overlap (see Fig. 4.11).

[Fig. 4.11. Overlapping circles of convergence $|z| = 1$ and $|z - i/2| = \sqrt{5}/2$.]

In this shaded region of overlapping, (4.24) and (4.25) converge to the same value of
$1/(1 - z)$. Equations (4.24) and (4.25) are said to be analytic continuations of
each other.
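The agreement of (4.24) and (4.25) on the overlap, and the fact that (4.25) reaches beyond the unit circle, can be seen from partial sums. The function names, the number of terms, and the sample points below are illustrative assumptions.

```python
def series_about_zero(z, terms=200):
    """Partial sum of (4.24): sum of z^n."""
    return sum(z ** n for n in range(terms))


def series_about_half_i(z, terms=200):
    """Partial sum of (4.25), the expansion of 1/(1 - z) about z = i/2."""
    c = 1 - 0.5j
    w = (z - 0.5j) / c
    return sum(w ** n for n in range(terms)) / c


inside_both = 0.2 + 0.3j             # |z| < 1 and |z - i/2| < sqrt(5)/2
outside_unit = -0.3 + 1.2j           # |z| > 1 but |z - i/2| < sqrt(5)/2

print(series_about_zero(inside_both), series_about_half_i(inside_both))
print(series_about_half_i(outside_unit), 1 / (1 - outside_unit))
```

At the first point both series give $1/(1 - z)$. At the second point only (4.25) converges, and it still gives $1/(1 - z)$, which is the sense in which it continues (4.24).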


Problems

1. Obtain the Taylor-series expansion of $\sinh z = (e^z - e^{-z})/2$ about $z = 0$. The
same for $\cosh z = (e^z + e^{-z})/2$ about $z = 0$. Show that $\cosh^2 z = 1 + \sinh^2 z$,
$\frac{d}{dz} \sinh z = \cosh z$, $\frac{d}{dz} \cosh z = \sinh z$.

2. Show that $\text{Ln } (1 + z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1} z^n}{n}$.

3. Show that $\sum_{n=0}^{\infty} (z^n/n!)$ converges for all $z$. Multiply the series $\sum_{n=0}^{\infty} (z_1^n/n!)$ and
$\sum_{n=0}^{\infty} (z_2^n/n!)$ to obtain $e^{z_1} e^{z_2} = e^{z_1+z_2}$.

4. If one defines $\tan^{-1} z$ by $\tan^{-1} z = \int_0^z \frac{dt}{1 + t^2}$, what difficulties occur? How
can one make $\tan^{-1} z$ a single-valued function?

5. Show that $e^{2\pi n i} = 1$ if $n$ is an integer. If $e^{z+\alpha} = e^z$, show that $\alpha = 2\pi n i$, $n$ an
integer.

6. Define $w = \sqrt{z}$ as that function $w$ such that $w^2 = z$. If $z = re^{i\theta}$, $w = \rho e^{i\varphi}$, show
that $\rho = \sqrt{r}$, $\varphi = \theta/2$. Since $z = re^{i(\theta + 2\pi)}$, show that $\varphi = \theta/2 + \pi$ also. For $z \ne 0$,
show that $w$ is double-valued. Construct a Riemann surface so that $w$ is single-valued.
Is the origin a branch point?

7. If $w = \text{Ln } z$, show that $z = e^w$. Hint: $\text{Ln } z = \ln |z| + i\theta$, $e^w = e^{\ln |z| + i\theta}$.

8. Define $w = z^\alpha$, $\alpha$ complex, by the equation $w = z^\alpha = e^{\alpha \ln z}$. How can one make
$w$ single-valued? Show that for this case $\frac{dw}{dz} = \alpha z^{\alpha - 1}$.

9. Let $f(z)$ be analytic for $|z| < R$, $f(z) = \sum_{n=0}^{\infty} a_n z^n$. Show that, for $r < R$,

$$\frac{1}{2\pi} \int_0^{2\pi} |f(re^{i\theta})|^2\,d\theta = \sum_{n=0}^{\infty} |a_n|^2 r^{2n}$$

10. Let $f(z)$ be analytic in a simply connected region $R$ bounded by a simple closed
path $\Gamma$, $f(z)$ continuous on $\Gamma$, so that Cauchy's theorem applies. Let $z_0$ be an interior
point of $R$ so that $f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$. Let $C$ be a circle with center at $z_0$ and
radius $r$, $C$ inside $\Gamma$. Show that $|a_0|^2 + |a_1|^2 r^2 + |a_2|^2 r^4 + \cdots \le |a_0|^2$ if it is
assumed that $|f(z_0)| \ge |f(z)|$ for all $z$ in $R$. Hence prove the maximum-modulus
theorem, which states that $|f(z_0)| \ge |f(z)|$ for all $z$ in $R$ implies $z_0$ on $\Gamma$ if $f(z) \ne$ constant.


4.11. An Identity Theorem. Analytic Continuation. We have seen
that, if a function $f(z)$ is analytic at a point $z_0$, a Taylor-series expansion
exists. If one desires, then, one could define the class of analytic functions
as the totality of series of the form $\sum_{n=0}^{\infty} a_n (z - z_0)^n$ with nonzero
radii of convergence. Some of the series would be analytic continuations
of each other. Thus one could start with a particular analytic function
$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$ which converges for all $z$ such that $|z - z_0| < r \ne 0$.
Now choose a point, $z_1$, inside this circle. Since $f(z)$ is analytic at $z_1$, we
can find a series expansion for $f(z)$ in the form $\sum_{n=0}^{\infty} b_n (z - z_1)^n$ which
converges for all $z$ such that

$$|z - z_1| < r_1 \ne 0$$

This new region of convergence may extend beyond the original circle of
convergence (see Fig. 4.12).

[Fig. 4.12. Circle of convergence about $z_0$ and the overlapping circle of convergence about $z_1$.]

This process can be continued. Each series is an analytic continuation
of its predecessor, and conversely. All of them represent the original
$f(z)$, which has now been extended to other portions of the $z$ plane. One
might naturally ask, if a point $z = \zeta$ is reached by two different paths of
analytic continuation, do the two series representations thus obtained
converge to the same value in their common region of convergence? We
cannot answer this question until we prove Theorem 4.2.

THEOREM 4.2. Let $f(z)$ and $g(z)$ be analytic in a simply connected
open region $R$. Assume $f(z) = g(z)$ for a sequence of points $z_1, z_2, \ldots,
z_n, \ldots$ having $z_0$ as a limit point, $z_n$, $n = 0, 1, 2, \ldots, n, \ldots$, in $R$.
Then $f(z) \equiv g(z)$ in $R$.

The proof proceeds as follows: First note that $f(z_n) = g(z_n)$ so that $f(z_0) = \lim_{n\to\infty} f(z_n) = \lim_{n\to\infty} g(z_n) = g(z_0)$. Moreover, since $f(z)$ and $g(z)$ are analytic at $z_0$,

$$f(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k \qquad g(z) = \sum_{k=0}^{\infty} b_k (z - z_0)^k$$

and $f(z_0) = g(z_0)$ implies $a_0 = b_0$. Hence

$$(z_n - z_0)\sum_{k=1}^{\infty} a_k (z_n - z_0)^{k-1} = (z_n - z_0)\sum_{k=1}^{\infty} b_k (z_n - z_0)^{k-1}$$

for $n = 1, 2, 3, \ldots$. This implies

$$\sum_{k=1}^{\infty} a_k (z_n - z_0)^{k-1} = \sum_{k=1}^{\infty} b_k (z_n - z_0)^{k-1}$$

for $n = 1, 2, 3, \ldots$. Hence

$$\lim_{n\to\infty} \sum_{k=1}^{\infty} a_k (z_n - z_0)^{k-1} = \lim_{n\to\infty} \sum_{k=1}^{\infty} b_k (z_n - z_0)^{k-1}$$

which implies $a_1 = b_1$. By mathematical induction the reader can show that $a_n = b_n$, $n = 0, 1, 2, \ldots$. This shows that $f(z) \equiv g(z)$ for some neighborhood of $z_0$. This neighborhood (a circle) extends up to the nearest singularity of $f(z)$ or $g(z)$. Now let $\zeta$ be any point of $R$, and construct a simple path $\Gamma_0$ from $z_0$ to $\zeta$ lying entirely in $R$. Let $\Gamma$ be the boundary of $R$ (see Fig. 4.13). Call the shortest distance from $\Gamma_0$ to $\Gamma$, $\rho$. The radius of convergence of $f(z)$ about $z = z_0$ is $\geq \rho$. Why? Now choose a point $z_1$ on $\Gamma_0$ interior to the circle of convergence of $f(z)$ and $g(z)$ about $z_0$. Since $f(z)$ and $g(z)$ are identical in a neighborhood of $z_1$, we easily prove in exactly the same manner as above that $f(z) \equiv g(z)$ for some circle of convergence about $z_1$ whose radius is greater than or equal to $\rho$. Let the reader show that $z = \zeta$ can be reached in a finite number of steps. When $\zeta$ is interior to one of the circles of convergence, $f(\zeta) = g(\zeta)$, which proves the theorem.

Fig. 4.13

It is important to note that at each point $z_0, z_1, z_2, \ldots, z_n$ the actual given function $f(z)$ and its derivatives are used to obtain the Taylor series expansion of $f(z)$. Analytic continuation is not used since we are not at all sure that the value of $f(z)$ at $z = \zeta$ would be equal to the value at $z = \zeta$ obtained by analytic continuation. That this is true for a simply connected region requires some proof. This is essentially the monodromy theorem, whose proof we omit. The formal statement of the monodromy theorem is this:


Let $R$ be a simply connected region, and let $f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$ have a nonzero radius of convergence. If $f(z)$ can be continued analytically from $z_0$ along every path in $R$, then this continuation gives rise to a function which is single-valued and analytic in $R$.
We now state some consequences of the identity theorem:

(a) The identity theorem holds for a connected region.

(b) Let $f(z)$ be analytic at $z = z_0$. Let $z_1, z_2, z_3, \ldots, z_n, \ldots$ be a sequence of points which tend to $z_0$ as a limit such that $f(z_n) = 0$, $n = 1, 2, 3, \ldots$. Then $f(z) \equiv 0$ in a neighborhood of $z = z_0$.

(c) Let $f(z)$ be analytic at $z_0$, $f(z) \not\equiv$ constant. A neighborhood, $N$, of $z = z_0$ can be found such that $f(z_1) \neq f(z_0)$ for $z_1$ in $N$, $z_1 \neq z_0$.

(d) Let $f(z)$ and $g(z)$ be analytic in an open connected region $R$. Assume $f^{(n)}(z_0) = g^{(n)}(z_0)$, $n = 0, 1, 2, \ldots$, $z_0$ in $R$. It can be shown that $f(z) \equiv g(z)$ in $R$.

Problems 


1. Prove (a). 
2. Prove (b). 
3. Prove (c).
4. Prove (d). 


5. Consider $f(z) = \sin \frac{1}{1 - z}$, $|z| < 1$. Show that $f(z)$ has an infinite number of zeros in the region $|z| < 1$. Does this contradict (b)?

6. Let $\varphi(x)$ be a real-valued function of the real variable $x$, $a \leq x \leq b$. Show that, if it is at all possible to continue $\varphi(x)$ analytically into the $z$ plane, the continuation is unique. Hint: Assume $f(z)$ and $g(z)$ analytic for $z$ real, $a \leq x \leq b$, $f(x) = \varphi(x) = g(x)$.

7. Use the results of Prob. 6 to show that

$$f(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}$$

is the only possible definition of $e^z$.

8. Consider $z^a = e^{a \ln z}$. Define $\Omega$ as the operator of continuation by starting at $z$ and returning to $z$, the path of continuation encircling the origin. Show that

$$\Omega z^a = e^{2\pi i a} z^a$$

Hint: $\mathrm{Ln}\, z = \ln |z| + i\theta$. After encircling the origin $\ln z = \ln |z| + i\theta + 2\pi i$.

9. Let $f(z)$ be analytic at $z = z_0$ so that $f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$ converges for $|z - z_0| < R$. Show that $f(z)$ has at least one singular point on the circle $z = z_0 + Re^{i\theta}$, $0 \leq \theta < 2\pi$. Hint: Assume $f(z)$ analytic at each point of the circle, obtain a circle of convergence at each point, apply the Heine-Borel theorem, and extend the radius of convergence, a contradiction.


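4.12. Laurent's Expansion. A generalization of Taylor series is due to Laurent. Let $f(z)$ be analytic in an annular ring bounded by the two circles $K_1$, $K_2$ with common center $z = a$. Nothing is said of $f(z)$ inside $K_1$ or outside $K_2$ (see Fig. 4.14). Now let $z$ be any point in the annular ring, and construct circles $C_1$ and $C_2$ with centers at $z = a$ such that $z$ is interior to the annular ring lying between $C_1$ and $C_2$. Moreover $C_1$ and $C_2$ lie in the annular ring between $K_1$ and $K_2$. From Cauchy's integral formula and theorem

$$f(z) = \frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)\,d\zeta}{\zeta - z} - \frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)\,d\zeta}{\zeta - z}$$

Fig. 4.14

This result can be obtained easily by using the method found in Sec. 4.8B. Now

$$\frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)\,d\zeta}{\zeta - z} = \frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)\,d\zeta}{(\zeta - a) - (z - a)} = \frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)\,d\zeta}{(\zeta - a)[1 - (z - a)/(\zeta - a)]}$$

Since $\left|\dfrac{z - a}{\zeta - a}\right| \leq r < 1$ for $\zeta$ on $C_2$, we write

$$\frac{1}{1 - (z - a)/(\zeta - a)} = \sum_{n=0}^{\infty} \left(\frac{z - a}{\zeta - a}\right)^n$$

and interchange the order of integration and summation because of the uniform convergence of the series. This yields

$$\frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)\,d\zeta}{\zeta - z} = \sum_{n=0}^{\infty} a_n (z - a)^n \qquad a_n = \frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{n+1}} \quad n = 0, 1, 2, \ldots \qquad (4.27)$$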
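For the second integral

$$-\frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)\,d\zeta}{\zeta - z} = -\frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)\,d\zeta}{(\zeta - a) - (z - a)} = \frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)\,d\zeta}{(z - a)[1 - (\zeta - a)/(z - a)]}$$

Since $\left|\dfrac{\zeta - a}{z - a}\right| \leq s < 1$ for $\zeta$ on $C_1$, we write

$$\frac{1}{1 - (\zeta - a)/(z - a)} = \sum_{n=0}^{\infty} \left(\frac{\zeta - a}{z - a}\right)^n$$

Interchanging integration and summation yields

$$-\frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)\,d\zeta}{\zeta - z} = \sum_{n=1}^{\infty} a_{-n} (z - a)^{-n} \qquad a_{-n} = \frac{1}{2\pi i}\oint_{C_1} f(\zeta)(\zeta - a)^{n-1}\,d\zeta \quad n = 1, 2, 3, \ldots \qquad (4.28)$$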


Hence

$$f(z) = \sum_{n=0}^{\infty} a_n (z - a)^n + \sum_{n=1}^{\infty} a_{-n} (z - a)^{-n} = S_1(z - a) + S_2(z - a)$$

where $S_1(z - a) = \sum_{n=0}^{\infty} a_n (z - a)^n$ and $S_2(z - a) = \sum_{n=1}^{\infty} a_{-n} (z - a)^{-n}$.

Since $(z - a)^{-n-1} f(z)$ and $(z - a)^{n-1} f(z)$ are analytic in the annular ring between $K_1$ and $K_2$, we can choose any path of integration to calculate the $a_n$, $n = 0, \pm 1, \pm 2, \ldots$, provided the path $\Gamma$ encircles the point $z = a$ exactly once. Hence we can write

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - a)^n \qquad a_n = \frac{1}{2\pi i}\oint_{\Gamma} \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{n+1}} \quad n = 0, \pm 1, \pm 2, \ldots \qquad (4.29)$$

Equation (4.29) is the Laurent expansion of $f(z)$ valid for all $z$ in the annular ring between $K_1$ and $K_2$.

We leave it as an exercise for the reader to show that $S_1(z)$ converges for all $z$ inside $K_2$ and that $S_2(z)$ converges for all $z$ outside $K_1$. The common region of convergence is the above-mentioned annular ring.


Example 4.19. Consider $f(z) = 1/[(z - 1)(z - 2)]$. Certainly $f(z)$ is analytic for all $z$ such that $1 < |z| < 2$. Let us find the Laurent expansion for $f(z)$ in this region. It is not necessary to find the $a_n$ of (4.29). We wish to write $f(z)$ as the sum of two series, one series converging for $|z| < 2$, the other for $|z| > 1$. We write

$$\frac{1}{(z - 1)(z - 2)} = \frac{1}{z - 2} - \frac{1}{z - 1} = -\frac{1}{2}\,\frac{1}{1 - z/2} - \frac{1}{z}\,\frac{1}{1 - 1/z} = -\sum_{n=0}^{\infty} \frac{z^n}{2^{n+1}} - \sum_{n=1}^{\infty} \frac{1}{z^n}$$

The first series converges for $|z| < 2$ and the second for $|z| > 1$. Let us check this answer by using (4.29) to find the $a_n$.

$$a_n = \frac{1}{2\pi i}\oint_{\Gamma} \frac{d\zeta}{(\zeta - 1)(\zeta - 2)\zeta^{n+1}}$$

where $\Gamma$ is any simple path enclosing the origin and lying between the circles $|z| = 1$, $|z| = 2$. We write, for $n \geq 0$,

$$\frac{1}{(\zeta - 1)(\zeta - 2)\zeta^{n+1}} = \frac{A}{\zeta - 1} + \frac{B}{\zeta - 2} + \frac{C}{\zeta} + \frac{D}{\zeta^2} + \frac{E}{\zeta^3} + \cdots \qquad (4.30)$$

Hence

$$\frac{1}{2\pi i}\oint_{\Gamma} \frac{d\zeta}{(\zeta - 1)(\zeta - 2)\zeta^{n+1}} = A + C \qquad \text{Why?}$$

To find $A$, multiply (4.30) by $\zeta - 1$, and then let $\zeta \to 1$. We obtain $-1 = A$. It is easy to see that $B = 1/2^{n+1}$. To find $C$, multiply (4.30) by $\zeta$, and let $\zeta \to \infty$. We have $0 = A + B + C$ so that $C = 1 - 1/2^{n+1}$. Hence

$$a_n = \frac{1}{2\pi i}\oint_{\Gamma} \frac{d\zeta}{(\zeta - 1)(\zeta - 2)\zeta^{n+1}} = -\frac{1}{2^{n+1}} \qquad n \geq 0$$

Let the reader show that $a_n = -1$ for $n < 0$.
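The coefficients of Example 4.19 can also be checked by evaluating the contour integral of (4.29) numerically. The sketch below is only an illustration: it assumes the circle $|\zeta| = 1.5$ as the path $\Gamma$ and uses the trapezoidal rule in the angle; the function names are hypothetical.

```python
import cmath
import math

def f(z):
    """The function of Example 4.19."""
    return 1 / ((z - 1) * (z - 2))

def laurent_coeff(n, radius=1.5, samples=2048):
    """Approximate a_n = (1/(2 pi i)) * integral of f(zeta)/zeta^(n+1) d zeta
    over |zeta| = radius. With zeta = radius*e^(i theta), d zeta = i zeta d theta,
    so a_n is the mean value of f(zeta) * zeta^(-n) over the circle."""
    total = 0j
    for k in range(samples):
        zeta = cmath.rect(radius, 2 * math.pi * k / samples)
        total += f(zeta) / zeta**n
    return total / samples

for n in (-3, -2, -1, 0, 1, 2, 3):
    print(n, laurent_coeff(n))
```

For $n \geq 0$ the printed values should agree with $-1/2^{n+1}$, and for $n < 0$ with $-1$, up to rounding.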


4.13. Singular Points. We had previously stated that $z_0$ is a singular point of $f(z)$ if $f(z)$ is not analytic at $z_0$. If, moreover, $f(z)$ is analytic for some neighborhood of $z_0$ with the exception of $z_0$, we say that $z_0$ is an isolated singular point. In this case the Laurent expansion of $f(z)$ converges for $0 < |z - z_0| < r$, where $r$ is the distance from $z_0$ to the nearest singular point of $f(z)$ other than $z_0$ itself. We distinguish now between 3 types of functions which have isolated singular points.

Case 1. The series $f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n$ is such that $a_n = 0$ for all $n < 0$. We need only redefine $f(z)$ at $z_0$ to be $a_0$ and $f(z)$ becomes analytic at $z = z_0$. A singularity of this type is said to be a removable singularity. Thus $f(z) = \sum_{n=0}^{\infty} (z^n/n!)$ for $z \neq 0$, $f(0) = 2$ can be made analytic at $z = 0$ by defining $f(0) = \lim_{z \to 0} \sum_{n=0}^{\infty} (z^n/n!) = 1$.

Case 2. All but a finite number of the $a_n$, $n < 0$, vanish. In this case we say that $z_0$ is a pole of $f(z)$. We write

$$f(z) = \frac{a_{-n}}{(z - z_0)^n} + \cdots + \frac{a_{-1}}{z - z_0} + \sum_{k=0}^{\infty} a_k (z - z_0)^k \qquad a_{-n} \neq 0$$

and $z_0$ is said to be a pole of order $n$. If

$$f(z) = \frac{a_{-1}}{z - z_0} + \sum_{k=0}^{\infty} a_k (z - z_0)^k$$

$z = z_0$ is said to be a simple pole of $f(z)$.

The reader can easily verify that $|f(z)|$ becomes unbounded as $z \to z_0$ if $z_0$ is a pole. A pole is likewise called a nonessential singular point.
Case 3. An infinite number of the $a_n$, $n < 0$, do not vanish. In this case we say that $z_0$ is an essential singular point. For example,

$$e^{1/z} = 1 + \frac{1}{z} + \frac{1}{2!\,z^2} + \frac{1}{3!\,z^3} + \cdots$$

has an essential singularity at $z = 0$. An important property of an essential singularity is Theorem 4.3, due to Picard, which we state without proof.

THEOREM 4.3. In any neighborhood of an essential singularity a single-valued function takes on every value, with one possible exception, an infinity of times.

Let us consider $e^{1/z}$ at $z = 0$ as an example. Are there an infinite number of $z$ in any neighborhood of $z = 0$ for which $e^{1/z} = e^{50}$? The answer is "yes"! Since $e^{2\pi n i} = 1$, we have $e^{1/z + 2\pi n i} = e^{50}$, so the equality holds if $1/z_n + 2\pi n i = 50$, or $z_n = 1/(50 - 2\pi n i)$. Let $n = -1, -2, -3$, etc., and we have an infinite sequence of $z_n$ which tend to $z = 0$ as a limit and such that $e^{1/z_n} = e^{50}$. Let the reader show that the same argument applies to any other nonzero value in place of $e^{50}$. The exceptional value stated in Picard's theorem is zero. There is no $z$ in any neighborhood of $z = 0$ for which $e^{1/z} = 0$. Let the reader show this.
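The sequence $z_n = 1/(50 - 2\pi n i)$ of the preceding paragraph is easy to examine numerically. This short sketch, with a hypothetical helper `z_n`, prints $|z_n|$ and the relative deviation of $e^{1/z_n}$ from $e^{50}$.

```python
import cmath
import math

def z_n(n, target=50.0):
    """The point z_n = 1/(target - 2 pi n i) at which e^(1/z) equals e^target."""
    return 1 / (target - 2j * math.pi * n)

for n in (-1, -10, -100, -1000):
    z = z_n(n)
    relative_error = abs(cmath.exp(1 / z) / math.exp(50.0) - 1)
    print(n, abs(z), relative_error)
```

The moduli $|z_n|$ shrink toward zero while the value $e^{50}$ is attained (to rounding error) at every $z_n$.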

We shall see in Chap. 5 that the point at infinity will play an important role in the development of differential equations in the complex domain. First we say that the point at infinity is an isolated singular point if $f(z)$ is analytic for all $z$ such that $|z| > R > 0$. We then replace $z$ in $f(z)$ by $1/t$ and investigate $f(1/t)$ at $t = 0$. The nature of $f(1/t)$ at $t = 0$ is defined to be the nature of $f(z)$ at $z = \infty$.


Example 4.20. (a) $f(z) = 1/z$, $f(1/t) = t$, so that, if we define $f(\infty) = 0$, $f(z)$ is analytic at $z = \infty$. (b) $f(z) = z - 1/z$, $f(1/t) = 1/t - t$, which has a simple pole at $t = 0$. We say, therefore, that $f(z)$ has a simple pole at infinity. (c) $f(z) = \sum_{n=0}^{\infty} \dfrac{z^n}{n!}$, $f(1/t) = \sum_{n=0}^{\infty} \dfrac{1}{n!\,t^n}$. Since $f(1/t)$ has an essential singularity at $t = 0$ we say that $f(z)$ has an essential singularity at $z = \infty$.


Problems 


1. Find the Laurent expansion of $f(z) = 1/[(z^2 + 1)(z - 2)]$ for $1 < |z| < 2$.
1 


AG pete | SS er 


2. Find the Laurent expansion of f(z) = 
1 < |z| < 2; for |z| > 2. 

3. Show that the coefficients in the Laurent expansion (4.29) are unique.

4. Find all the roots of $e^z = 1$.

5. Find a function $f(z)$ which has a simple pole at $z = 0$, $z = 1$, and $z = \infty$.

6. Define $\cosh z = (e^z + e^{-z})/2$. Show that

$$\cosh\left(z + \frac{1}{z}\right) = a_0 + \sum_{n=1}^{\infty} a_n \left(z^n + \frac{1}{z^n}\right) \qquad a_n = \frac{1}{2\pi}\int_0^{2\pi} \cos n\theta \cosh (2\cos\theta)\,d\theta$$

for $|z| > 0$.

7. Let $f(z)$ have a simple pole at $z = z_0$, $f(z) = \dfrac{a_{-1}}{z - z_0} + \sum_{n=0}^{\infty} a_n (z - z_0)^n$. We call $a_{-1}$ the residue of $f(z)$ at $z = z_0$. Show that $a_{-1} = \lim_{z \to z_0} (z - z_0) f(z)$.

8. Let $f(z)$ have a pole of order $k$ at $z = z_0$. Show that the residue $a_{-1}$ is given by

$$a_{-1} = \frac{1}{(k - 1)!}\,\frac{d^{k-1}}{dz^{k-1}}\left[(z - z_0)^k f(z)\right]_{z = z_0}$$


9. Find the residues of f(z) = z4/(c® + 2*)4 at its poles. 




10. Find the residue of f(z) = 1+2+4+ 22% atz = o. 
11. Let $f(z, t) = e^{(t/2)(z - 1/z)}$. Show that the Laurent expansion of $f(z, t)$ for $|z| > 0$ is

$$f(z, t) = \sum_{n=-\infty}^{\infty} J_n(t) z^n$$

where

$$J_n(t) = (-1)^n J_{-n}(t) = \frac{1}{2\pi}\int_0^{2\pi} \cos (n\theta - t \sin\theta)\,d\theta$$

12. Let $f(z)$ be analytic in the infinite strip given by $-a < \mathrm{Im}\, z < a$, $a > 0$. Assume $f(z) = f(z + 2\pi)$. By use of Laurent's theorem show that

$$f(z) = \sum_{n=-\infty}^{\infty} c_n e^{inz} \qquad c_n = \frac{1}{2\pi}\int_0^{2\pi} f(z) e^{-inz}\,dz$$

13. Let $f(z)$ be an entire function (analytic everywhere) with a pole of order $n$ at $z = \infty$. Obtain the Laurent expansion of $f(z)$ for $0 \leq |z| < \infty$, and show that

$$f(z) = a_0 + a_1 z + \cdots + a_n z^n$$

14. Let $f(z)$ be analytic everywhere with the exception of a finite number of poles at $z_1, z_2, \ldots, z_k$ and a pole at $z = \infty$. The order of the pole at $z_i$ is $\alpha_i$, $i = 1, 2, \ldots, k$. Consider

$$\varphi(z) = (z - z_1)^{\alpha_1} (z - z_2)^{\alpha_2} \cdots (z - z_k)^{\alpha_k} f(z)$$

and show that $f(z)$ is a rational function, the quotient of two polynomials. The result of Prob. 13 is useful.

15. Let $p(z)$ have simple poles at the finite points $z = \xi, \eta, \zeta$. Also assume that $2z - z^2 p(z)$ is analytic at $z = \infty$. Consider

$$P(z) = (z - \xi)(z - \eta)(z - \zeta)\,p(z)$$

and show that $P(z)$ has a pole at most of the order 2 at $z = \infty$. Hence show that

$$p(z) = \frac{A}{z - \xi} + \frac{B}{z - \eta} + \frac{C}{z - \zeta}$$

where $A$, $B$, $C$ are constants.


4.14. Residue Theorem. Contour Integration. Let $z = z_0$ be an isolated singular point of $f(z)$. For $f(z)$ single-valued and analytic in the region $0 < |z - z_0| < R$ we have the Laurent expansion

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n$$

Let $\Gamma$ be a simple closed path encircling $z = z_0$ lying in the region $0 < |z - z_0| < R$. Then

$$\oint_{\Gamma} f(z)\,dz = \oint_{\Gamma} \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n\,dz = \sum_{n=-\infty}^{\infty} a_n \oint_{\Gamma} (z - z_0)^n\,dz = 2\pi i\, a_{-1}$$

since $\oint_{\Gamma} (z - z_0)^n\,dz = 0$ for $n \neq -1$, $\oint_{\Gamma} \dfrac{dz}{z - z_0} = 2\pi i$. The interchange of integration and summation can be justified. We leave this as an exercise for the reader. The constant $a_{-1}$ is called the residue of $f(z)$ at $z = z_0$.
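The fact that only the power $n = -1$ contributes can be observed numerically. The sketch below is an illustration under assumptions: the path $\Gamma$ is taken to be a circle about an arbitrarily chosen $z_0$, and `circle_integral` is a hypothetical helper.

```python
import cmath
import math

def circle_integral(g, center, radius=1.0, samples=4096):
    """Approximate the counterclockwise integral of g over |z - center| = radius."""
    total = 0j
    d_theta = 2 * math.pi / samples
    for k in range(samples):
        offset = cmath.rect(radius, k * d_theta)   # radius * e^(i theta)
        dz = 1j * offset * d_theta                 # dz = i * radius * e^(i theta) d theta
        total += g(center + offset) * dz
    return total

z0 = 0.3 + 0.2j   # arbitrary center chosen for illustration
for n in (-3, -2, -1, 0, 1, 2):
    value = circle_integral(lambda z, n=n: (z - z0) ** n, z0)
    print(n, value)
```

Only the line for $n = -1$ should show a value close to $2\pi i$; the others should vanish to rounding error.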

Let the reader prove Theorem 4.4.

THEOREM 4.4. Let $R$ be a simply connected open set, and let $\Gamma$ be a simple closed path in $R$. Let $f(z)$ be single-valued and analytic in $R$ with the exception of a finite number of isolated singular points. Then

$$\oint_{\Gamma} f(z)\,dz = 2\pi i \cdot (\text{sum of residues of } f(z) \text{ inside } \Gamma) \qquad (4.31)$$


The residue theorem (4.31) is highly useful in the evaluation of real 
definite integrals. We discuss some examples. 


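Example 4.21. Let us evaluate $\int_{-\infty}^{\infty} \dfrac{dx}{1 + x^2}$. We deal naturally with $f(z) = \dfrac{1}{1 + z^2}$. Now $f(z)$ has a simple pole at $z = i$. To apply the residue theorem, we look for a path containing $z = i$ in its interior. At the same time we desire that part of the simple closed path be part of the real axis. We choose as $\Gamma$ the straight-line segment extending from $z = -R$ to $z = R$ and the upper semicircle $|z| = R > 1$ (see Fig. 4.15).

Fig. 4.15

Since $f(z) = 1/(1 + z^2)$ has a simple pole at $z = i$, the residue of $f(z)$ at $z = i$ is

$$\lim_{z \to i} \frac{z - i}{1 + z^2} = \lim_{z \to i} \frac{z - i}{(z - i)(z + i)} = \frac{1}{2i}$$

(see Prob. 7, Sec. 4.13). Hence

$$\int_{-R}^{R} \frac{dx}{1 + x^2} + \int_0^{\pi} \frac{Rie^{i\theta}\,d\theta}{1 + R^2 e^{2i\theta}} = 2\pi i \cdot \frac{1}{2i} = \pi \qquad (4.32)$$

Now

$$\left|\int_0^{\pi} \frac{Rie^{i\theta}\,d\theta}{1 + R^2 e^{2i\theta}}\right| \leq R\int_0^{\pi} \frac{d\theta}{|1 + R^2 e^{2i\theta}|} \leq \frac{R\pi}{R^2 - 1}$$

since $|R^2 e^{2i\theta} + 1| \geq |R^2 e^{2i\theta}| - 1 = R^2 - 1$, $R > 1$. If we allow $R$ to become infinite, (4.32) becomes

$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = \pi$$

since $\lim_{R \to \infty} R\pi/(R^2 - 1) = 0$.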
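The balance expressed by (4.32) and the bound on the semicircle contribution can be checked for finite $R$. This is a minimal sketch with hypothetical helper names; the arc integral is approximated by the midpoint rule, and the integral along the real axis is taken from the antiderivative $\arctan x$.

```python
import cmath
import math

def arc_integral(R, samples=20000):
    """Midpoint-rule approximation of the integral of dz/(1 + z^2)
    over the upper semicircle z = R e^(i theta), 0 <= theta <= pi."""
    h = math.pi / samples
    total = 0j
    for k in range(samples):
        z = cmath.rect(R, (k + 0.5) * h)
        total += (1j * z) / (1 + z * z) * h      # dz = i z d theta
    return total

def segment_integral(R):
    """Integral of dx/(1 + x^2) from -R to R, from the antiderivative arctan."""
    return 2 * math.atan(R)

for R in (2.0, 10.0, 100.0):
    arc = arc_integral(R)
    print(R, segment_integral(R) + arc, abs(arc), R * math.pi / (R * R - 1))
```

For each $R$ the sum of the two pieces should be close to $\pi$, while the arc contribution stays below $R\pi/(R^2 - 1)$ and shrinks as $R$ grows.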


The method used above is known as contour integration. This method can be used to advantage to evaluate integrals of the type $\int_{-\infty}^{\infty} f(x)\,dx$. The choice of the closed contour of integration is not always apparent. One does not always pick a semicircle as part of the contour.
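Example 4.22. Let us evaluate $\int_{-\infty}^{\infty} \dfrac{x \sin x}{x^2 + a^2}\,dx$, $a$ positive. Instead of dealing with $(z \sin z)/(z^2 + a^2)$ we consider $f(z) = ze^{iz}/(z^2 + a^2)$. Remember that

$$e^{ix} = \cos x + i \sin x$$

which introduces the term $\sin x$. $e^{iz}$ is much easier to handle than $\sin z$. As the contour of integration let us use that of Example 4.21 with $R > a$, since $f(z)$ has a simple pole at $z = ai$. We apply the residue theorem and obtain

$$\int_{-R}^{R} \frac{xe^{ix}\,dx}{x^2 + a^2} + \int_C \frac{ze^{iz}\,dz}{z^2 + a^2} = 2\pi i \cdot \frac{ai\,e^{-a}}{2ai} = \pi i e^{-a}$$

where $C$ is the upper semicircle of Fig. 4.15. On $C$, $z = Re^{i\theta}$, $dz = Rie^{i\theta}\,d\theta$, so that

$$\int_C \frac{ze^{iz}\,dz}{z^2 + a^2} = \int_0^{\pi} \frac{iR^2 e^{2i\theta} e^{iRe^{i\theta}}}{R^2 e^{2i\theta} + a^2}\,d\theta$$

We see that it will be difficult to determine $\lim_{R \to \infty} \int_C \dfrac{ze^{iz}\,dz}{z^2 + a^2}$. We abandon the semicircle and choose as our contour of integration the rectangle with vertices at $z = -R$, $z = R$, $z = R + Ri$, $z = -R + Ri$. Then

$$\int_{-R}^{R} \frac{xe^{ix}\,dx}{x^2 + a^2} + \int_0^{R} \frac{(R + iy)e^{i(R + iy)}\,i\,dy}{(R + iy)^2 + a^2} + \int_{R}^{-R} \frac{(x + Ri)e^{i(x + Ri)}\,dx}{(x + Ri)^2 + a^2} + \int_{R}^{0} \frac{(-R + iy)e^{i(-R + iy)}\,i\,dy}{(-R + iy)^2 + a^2} = \pi i e^{-a}$$

Let the reader show that all integrals tend to zero as $R \to \infty$ except $\int_{-R}^{R} \dfrac{xe^{ix}\,dx}{x^2 + a^2}$, so that

$$\int_{-\infty}^{\infty} \frac{xe^{ix}\,dx}{x^2 + a^2} = \pi i e^{-a}$$

and

$$\int_{-\infty}^{\infty} \frac{x \sin x\,dx}{x^2 + a^2} = \pi e^{-a}$$

by equating the imaginary parts.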
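The residue-theorem statement for the rectangle of Example 4.22 holds for every finite $R > a$, so it can be tested directly. The following sketch assumes the illustrative values $a = 1$, $R = 5$ and integrates along the four sides by the midpoint rule; the helper names are hypothetical.

```python
import cmath
import math

def f(z, a):
    """Integrand z e^(iz) / (z^2 + a^2) of Example 4.22."""
    return z * cmath.exp(1j * z) / (z * z + a * a)

def line_integral(a, start, end, samples=20000):
    """Midpoint rule along the straight segment from start to end."""
    step = (end - start) / samples
    return sum(f(start + (k + 0.5) * step, a) for k in range(samples)) * step

def rectangle_integral(a, R):
    """Counterclockwise integral around the rectangle -R, R, R + Ri, -R + Ri."""
    corners = [complex(-R, 0), complex(R, 0), complex(R, R), complex(-R, R), complex(-R, 0)]
    return sum(line_integral(a, p, q) for p, q in zip(corners, corners[1:]))

a, R = 1.0, 5.0
print(rectangle_integral(a, R), 1j * math.pi * math.exp(-a))
```

The two printed numbers should agree closely, since the only singularity inside the rectangle is the simple pole at $z = ai$.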
Example 4.23. As an illustration of the residue theorem consider

$$f(z) = (z - z_0)^n \sum_{k=0}^{\infty} a_k (z - z_0)^k \qquad a_0 \neq 0$$

We say that $z = z_0$ is an $n$th-order zero of $f(z)$. We can also write $f(z) = (z - z_0)^n \varphi(z)$, $\varphi(z_0) \neq 0$, $\varphi(z)$ analytic in a neighborhood of $z_0$. Let the reader show that there exists a neighborhood of $z_0$ such that $\varphi(z) \neq 0$ for all $z$ in this neighborhood. Then

$$f'(z) = n(z - z_0)^{n-1}\varphi(z) + (z - z_0)^n \varphi'(z) = (z - z_0)^{n-1}[n\varphi(z) + (z - z_0)\varphi'(z)]$$

We see that $z = z_0$ is an $(n - 1)$st-order root of $f'(z) = 0$. Moreover

$$g(z) \equiv \frac{f'(z)}{f(z)} = \frac{n\varphi(z) + (z - z_0)\varphi'(z)}{(z - z_0)\varphi(z)} = \frac{n}{z - z_0} + \frac{\varphi'(z)}{\varphi(z)}$$

so that $z = z_0$ is a simple pole of $g(z)$, since $\varphi(z_0) \neq 0$. Hence for any simple closed path $\Gamma$ surrounding $z_0$ for which $\varphi(z) \neq 0$ for all $z$ inside $\Gamma$ we have

$$\oint_{\Gamma} \frac{f'(z)}{f(z)}\,dz = 2\pi i n$$

Similarly, let

$$f(z) = \frac{a_{-m}}{(z - z_0)^m} + \cdots + \frac{a_{-1}}{z - z_0} + \sum_{k=0}^{\infty} a_k (z - z_0)^k$$

so that

$$f(z) = \frac{\varphi(z)}{(z - z_0)^m}$$

$\varphi(z)$ analytic in a neighborhood of $z = z_0$, $\varphi(z_0) = a_{-m} \neq 0$. Let the reader show that $f'(z)/f(z) = -m/(z - z_0) + \varphi'(z)/\varphi(z)$. $m$ is called the order of the pole at $z = z_0$. The reader should now be able to prove Theorem 4.5.


THEOREM 4.5. Let $f(z)$ be analytic in a simply connected set $R$ with the possible exception of isolated singular points. Let $C$ be a simple closed path in $R$ enclosing a finite number of these isolated singular points which are poles, $f(z) \neq 0$ on $C$. Then

$$\frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz = N - P \qquad (4.33)$$

where $N$ is the number of zeros of $f(z)$ inside $C$, the order of each zero counted in determining $N$, and $P$ is the number of poles, the order of each pole counted in determining $P$.
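Formula (4.33) can be illustrated on a concrete function. The sketch below assumes the hypothetical example $f(z) = z^2 (z - 3)/(z - 0.5)$, which has a double zero at $0$, a simple zero at $3$, and a simple pole at $0.5$, and takes circles $|z| = r$ as the path $C$.

```python
import cmath
import math

def log_derivative(z):
    """f'(z)/f(z) for the illustrative f(z) = z^2 (z - 3) / (z - 0.5)."""
    return 2 / z + 1 / (z - 3) - 1 / (z - 0.5)

def zeros_minus_poles(radius, samples=4096):
    """Approximate (1/(2 pi i)) * integral of f'/f over |z| = radius.
    With z = radius*e^(i theta), dz = i z d theta, so the result is the mean of (f'/f)*z."""
    total = 0j
    for k in range(samples):
        z = cmath.rect(radius, 2 * math.pi * k / samples)
        total += log_derivative(z) * z
    return total / samples

for radius in (0.25, 2.0, 4.0):
    print(radius, zeros_minus_poles(radius))
```

For $r = 0.25$ only the double zero is enclosed ($N - P = 2$), for $r = 2$ the pole is also enclosed ($N - P = 1$), and for $r = 4$ all zeros and the pole are enclosed ($N - P = 2$).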


Problems 
0 2 
1. Show that [-. aoa = Sy Rla > 0. 
4 x? dz 
2. Evaluate Ve eee ae 
3. Integrate $e^{iz}/z$ around the rectangle with vertices at $z = \pm R$, $\pm R + Ri$, indented at the origin, and show that $\int_{-\infty}^{\infty} \dfrac{\sin x}{x}\,dx = \pi$ (see Fig. 4.16).

Fig. 4.16
4. Prove (4.33).

5. Let $f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$, and let $C$ be a circle $|z| = r$ such that $|f(z)| > 1$ for $|z| > r$. Show that $\dfrac{1}{2\pi i}\oint_C \dfrac{f'(z)}{f(z)}\,dz = n$. From (4.33) state a theorem concerning $f(z) = 0$.


6. Consider $z = e^{i\theta} = \cos\theta + i \sin\theta$. Show that $\cos\theta = \frac{1}{2}(z + 1/z)$, $\sin\theta = (1/2i)(z - 1/z)$.
Use this result to evaluate i ae 
0 + cos 6 
7. Integrate $e^{-z^2}$ around the rectangle whose sides are $x = -R$, $x = R$, $y = 0$, $y = b > 0$, and show that

$$\int_{-\infty}^{\infty} e^{-x^2} \cos 2bx\,dx = e^{-b^2}\int_{-\infty}^{\infty} e^{-x^2}\,dx$$
Then show that 


[c er dr = ({- db a e~?* cos 2br ac) = Vr 
from (6.78). 


ee: ee . wa\" 
8. For 0 < a < 2 show that iin dr = (2 gin *) 
© 2 
ait te" ar = wn 2. 
dx w 


10. For a > 0 show that A ea! = 2/24! 
11. For a > 0, b > 0 show that 





9. Show that [, 


$$\int_{-\infty}^{\infty} \frac{dx}{(x^2 + a^2)(x^2 + b^2)^2} = \frac{\pi(a + 2b)}{2ab^3(a + b)^2}$$


12. For a > b > 0 show that 


$$\int_{-\infty}^{\infty} \frac{\cos x\,dx}{(x^2 + a^2)(x^2 + b^2)} = \frac{\pi}{a^2 - b^2}\left(\frac{e^{-b}}{b} - \frac{e^{-a}}{a}\right)$$


4.15. The Schwarz-Christoffel Transformation. Let us consider a closed polygon in the $w$ plane. We assume that the polygon does not intersect itself (see Fig. 4.17).

Fig. 4.17

We wish to find a function $z = F(w)$ or $w = f(z)$ which maps the polygon into the real axis of the $z$ plane. Let $P_1$ be mapped into $(x_1, 0)$, $P_2$ into $(x_2, 0)$, etc. As we move along the polygon in the $w$ plane, we move to the right along the $x$ axis in the $z$ plane. Now if such a transformation exists, then $dw = f'(z)\,dz$ and $\arg dw = \arg f'(z) + \arg dz$. Along the $x$ axis, $\arg dz = \arg dx = 0$, so that

$$\arg dw = \arg f'(z)$$

along the polygon. Hence

$$\Delta \arg dw = \Delta \arg f'(z)$$

along the polygon, where $\Delta$ represents an abrupt change in the value of $\arg dw$. This occurs, for example, at $P_1$. As we turn the corner at $P_1$, $\arg dw$ changes abruptly by an amount $\pi\alpha_1$, $-1 < \alpha_1 < +1$. Now consider $f'(z) = (z - x_1)^{-\alpha_1}$, $\alpha_1$ real, so that $\arg f'(z) = -\alpha_1 \arg (z - x_1)$. Thus

$$\Delta \arg f'(z) = -\alpha_1 \Delta \arg (z - x_1) = -\alpha_1[\arg_{z > x_1} (z - x_1) - \arg_{z < x_1} (z - x_1)] = -\alpha_1(0 - \pi) = \pi\alpha_1$$

This suggests that the transformation we are looking for satisfies

$$\frac{dw}{dz} = f'(z) = A(z - x_1)^{-\alpha_1}(z - x_2)^{-\alpha_2} \cdots (z - x_n)^{-\alpha_n} \qquad (4.34)$$

The reader can easily verify that

$$\Delta \arg dw = \Delta \arg f'(z) = \pi\alpha_k \quad \text{at } z = x_k, \quad k = 1, 2, \ldots, n \qquad (4.35)$$
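The jump (4.35) in $\arg f'(z)$ at each $x_k$ can be observed numerically by evaluating (4.34) just above the real axis. The following sketch is an illustration only: the points $x_k$ and exponents $\alpha_k$ are hypothetical values chosen so that the $\alpha_k$ sum to 2, the constant $A$ is taken as 1, and the powers are understood through their principal arguments.

```python
import cmath
import math

def arg_fprime(x, vertices, alphas, eps=1e-12):
    """arg f'(z) for f'(z) = prod (z - x_k)^(-alpha_k), evaluated at z = x + i*eps,
    computed as -sum alpha_k * arg(z - x_k) with principal arguments."""
    z = complex(x, eps)
    return -sum(alpha * cmath.phase(z - xk) for xk, alpha in zip(vertices, alphas))

vertices = [-1.0, 0.5, 2.0]     # hypothetical x_k on the real axis
alphas = [0.5, 0.75, 0.75]      # hypothetical alpha_k, each in (-1, 1), summing to 2

for xk, alpha in zip(vertices, alphas):
    jump = arg_fprime(xk + 1e-3, vertices, alphas) - arg_fprime(xk - 1e-3, vertices, alphas)
    print(xk, jump, math.pi * alpha)
```

Passing each $x_k$ from left to right changes $\arg f'$ by $\pi\alpha_k$, while between the $x_k$ the argument stays constant, which corresponds to the straight sides of the polygon.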
It can be shown rigorously that

$$w(z) = A\int^z (z - x_1)^{-\alpha_1}(z - x_2)^{-\alpha_2} \cdots (z - x_n)^{-\alpha_n}\,dz + B \qquad (4.36)$$

is the required transformation. $A$ and $B$ are constants in the Schwarz-Christoffel transformation given by (4.36). It can also be shown that the interior of the polygon maps into the upper half of the $z$ plane. Since $\sum_{i=1}^{n} \alpha_i = 2$, the reader can easily verify that $w(\infty)$ and $w(-\infty)$ exist. By integrating (4.34) around the closed path of Fig. 4.18, allowing the radii of the small semicircles to tend to zero, and allowing the radius of the large semicircle to become infinite, the reader can verify that $w(-\infty) = w(\infty)$, not neglecting the fact that $-1 < \alpha_i < +1$, $i = 1, 2, \ldots, n$.

If we desire that $x_n$ be the point at infinity, we define $z = x_n - 1/\zeta$. When $z = x_n$, $\zeta = \infty$. Moreover $dz = (1/\zeta^2)\,d\zeta$ so that (4.36) becomes

$$w(z) = A\int \left(x_n - x_1 - \frac{1}{\zeta}\right)^{-\alpha_1} \cdots \left(x_n - x_{n-1} - \frac{1}{\zeta}\right)^{-\alpha_{n-1}} \left(-\frac{1}{\zeta}\right)^{-\alpha_n} \frac{d\zeta}{\zeta^2} + B$$

$$= A_1\int (\zeta - a_1)^{-\alpha_1}(\zeta - a_2)^{-\alpha_2} \cdots (\zeta - a_{n-1})^{-\alpha_{n-1}}\, \zeta^{\alpha_1 + \alpha_2 + \cdots + \alpha_n - 2}\,d\zeta + B$$

$$= A_1\int (\zeta - a_1)^{-\alpha_1}(\zeta - a_2)^{-\alpha_2} \cdots (\zeta - a_{n-1})^{-\alpha_{n-1}}\,d\zeta + B \qquad (4.37)$$

since $\sum_{i=1}^{n} \alpha_i = 2$. The $a_i$, $i = 1, 2, \ldots, n - 1$ are constants. Equation (4.37) is exactly of the form of (4.36) with the term $x_n = \infty$ omitted.

Example 4.24. We consider the polygon of Fig. 4.19.

Fig. 4.19

Two of the vertices of the polygon are at $w = -\pi/2 + 0i$, $w = \pi/2 + 0i$. Let us map $P$ into $(-a, 0)$ and $Q$ into $(a, 0)$ of the $z$ plane. At $P$, $\alpha_1 = \frac{1}{2}$, at $Q$, $\alpha_2 = \frac{1}{2}$, so that (4.36) can be written

$$w = A\int^z (z + a)^{-1/2}(z - a)^{-1/2}\,dz + B$$

The transformation $z = a \sin u$, $dz = a \cos u\,du$, $\sqrt{z^2 - a^2} = ai \cos u$, yields

$$w = A_1 \sin^{-1} \frac{z}{a} + B$$

From the conditions $(z = -a, w = -\pi/2)$, $(z = a, w = \pi/2)$ we have

$$w = \sin^{-1} \frac{z}{a} \qquad z = a \sin w \qquad (4.38)$$

If we let $w = \pi/2 + iR$, then $z = a \cos (iR) = a \cosh R$. As $R \to \infty$, $z \to \infty$. Similarly if we let $w = -\pi/2 + iR$, we see that $z \to -\infty$ as $R \to \infty$. The transformation (4.38) unfolds the polygon. For $z = x + iy$, $w = u + iv$, (4.38) becomes

$$x + iy = a \sin (u + iv) = a \sin u \cosh v + ia \cos u \sinh v$$
$$x = a \sin u \cosh v$$
$$y = a \cos u \sinh v$$

Hence $x^2/(a^2 \sin^2 u) - y^2/(a^2 \cos^2 u) = 1$ so that the straight lines $u = u_0 \neq 0, \pm\pi/2$ map into hyperbolas in the $z$ plane. Similarly the straight lines $v = v_0$ map into ellipses in the $z$ plane.
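The hyperbola and ellipse relations of Example 4.24 follow from $\cosh^2 v - \sinh^2 v = 1$ and $\sin^2 u + \cos^2 u = 1$, and they can be spot-checked numerically. The sketch below assumes an arbitrary value $a = 2$ and hypothetical helper names.

```python
import cmath
import math

a = 2.0   # illustrative value of the constant a in z = a sin w

def to_z(u, v):
    """Image z = a sin(u + iv) of the point w = u + iv under (4.38)."""
    return a * cmath.sin(complex(u, v))

def hyperbola_residual(u, v):
    """x^2/(a^2 sin^2 u) - y^2/(a^2 cos^2 u) - 1, which should vanish on a line u = const."""
    z = to_z(u, v)
    return (z.real / (a * math.sin(u))) ** 2 - (z.imag / (a * math.cos(u))) ** 2 - 1

def ellipse_residual(u, v):
    """x^2/(a^2 cosh^2 v) + y^2/(a^2 sinh^2 v) - 1, which should vanish on a line v = const."""
    z = to_z(u, v)
    return (z.real / (a * math.cosh(v))) ** 2 + (z.imag / (a * math.sinh(v))) ** 2 - 1

print([hyperbola_residual(0.7, v) for v in (0.1, 1.0, 2.5)])
print([ellipse_residual(u, 0.8) for u in (-1.2, 0.3, 1.0)])
```

Both lists should contain only values of rounding-error size.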

Example 4.25. The Schwarz-Christoffel transformation is very useful in solving certain problems in two-dimensional electrostatic theory. First let us note that for an analytic function, $w = f(z) = u(x, y) + iv(x, y)$, we have $\nabla^2 u = 0$ and $\nabla^2 v = 0$ (see Prob. 1, Sec. 4.6). If $v(x, y)$ is the electrostatic potential, then $\nabla^2 v = 0$. The lines of force, $u(x, y) = $ constant, are at right angles to the equipotential curves, $v(x, y) = $ constant. But for the analytic function, $w = u + iv$, we know that the curves $u(x, y) = $ constant intersect the curves $v(x, y) = $ constant at right angles (see Prob. 1, Sec. 4.6). Thus, to solve a two-dimensional electrostatic problem, we need only find $w = f(z) = u + iv$ such that $v(x, y)$ satisfies the electrostatic boundary conditions. Once this is done, $\mathbf{E} = -\nabla v$ yields the electric field. Moreover

Now let us consider an infinite line charge $q$ whose projection in the $xy$ plane is the origin. It is easy to show that $\nabla^2 u = 0$, $u = u(r)$, $r = (x^2 + y^2)^{1/2}$, has the solution $u(r) = 2q \ln r$. If we consider $w = 2qi \ln z$, we have

$$u + iv = 2qi \ln (re^{i\theta}) = i\,2q \ln r - 2q\theta$$

so that $v = 2q \ln r$ is the required potential. If the charge were placed at $z_0$, then $w = 2qi \ln (z - z_0)$. A similar line charge, $-q$, placed at $\bar{z}_0$ yields

$$w = -2qi \ln (z - \bar{z}_0)$$

The field due to both charges can be obtained from

$$w = 2qi \ln (z - z_0) - 2qi \ln (z - \bar{z}_0)$$

or

$$w = 2qi \ln \frac{z - z_0}{z - \bar{z}_0} \qquad (4.39)$$

If we consider a point on the $x$ axis, $z = x$, then

$$u + iv = 2qi \ln \frac{x - z_0}{x - \bar{z}_0}$$

Let the reader show that $2qi \ln \dfrac{x - z_0}{x - \bar{z}_0}$ is real, so that $v = 0$ for the $x$ axis. Hence (4.39) will yield the electric field for an infinite grounded plane (the $x$ axis) due to an infinite line charge placed at $z_0$. Remember that the imaginary part of $w$ of (4.39) satisfies Laplace's equation and satisfies the boundary condition $v = 0$ when $y = 0$.
Now we consider a more complicated example where we shall make use of the Schwarz-Christoffel transformation. Consider two semi-infinite grounded planes intersecting at an angle $\varphi$. We find $w(z)$ for an infinite line charge $q$ placed at $z_0$ (see Fig. 4.20).

Fig. 4.20

First we find the transformation which maps the polygon $AOB$ into the line $v(x, y) = 0$, which is the $u$ axis. We need only apply the Schwarz-Christoffel transformation (4.36) with $z$ and $w$ interchanged. We map $z = 0$ into $w = 0$. $\alpha$, at $z = 0$, is easily seen to be $(\pi - \varphi)/\pi = 1 - \varphi/\pi$, so that $dz = A w^{\varphi/\pi - 1}\,dw$ and $z = A w^{\varphi/\pi}$, $w = B z^{\pi/\varphi}$. The charge at $z_0$ maps into $w_0 = B z_0^{\pi/\varphi}$. The complex function for an infinite grounded plane with charge at $w_0$ has been worked out above. It is

$$W = U + iV = 2qi \ln \frac{w - w_0}{w - \bar{w}_0}$$

so that

$$W = U + iV = 2qi \ln \frac{z^{\pi/\varphi} - z_0^{\pi/\varphi}}{z^{\pi/\varphi} - \bar{z}_0^{\pi/\varphi}} \qquad (4.40)$$

$V(x, y)$ is the required potential function.
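The boundary behavior of (4.40) can be examined numerically. The sketch below is an illustration under assumptions: the constant $B$ is taken as 1, the powers are principal values, and the wedge angle $\varphi = \pi/3$ and charge position are arbitrary choices; `wedge_potential` is a hypothetical name.

```python
import cmath
import math

def wedge_potential(z, z0, phi, q=1.0):
    """V = Im W with W = 2 q i ln[(w - w0)/(w - conj(w0))], w = z^(pi/phi),
    following (4.40) with B = 1 and principal-value powers."""
    w = z ** (math.pi / phi)
    w0 = z0 ** (math.pi / phi)
    W = 2j * q * cmath.log((w - w0) / (w - w0.conjugate()))
    return W.imag

phi = math.pi / 3                      # illustrative wedge angle
z0 = cmath.rect(1.0, math.pi / 6)      # illustrative charge position inside the wedge

print(wedge_potential(complex(0.7, 0.0), z0, phi))         # on the plane theta = 0
print(wedge_potential(cmath.rect(0.7, phi), z0, phi))      # on the plane theta = phi
print(wedge_potential(cmath.rect(0.7, math.pi / 6), z0, phi))  # interior point
```

The first two values should vanish to rounding error, in agreement with the grounded planes, while the interior value is nonzero. With $\varphi = \pi$ the same function reduces to the half-plane potential of (4.39).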


As a special case let $\varphi = \pi$, $z_0 = r_0 i$. Equation (4.40) becomes

$$U + iV = 2qi \ln \frac{z - r_0 i}{z + r_0 i} = 2qi[\ln (z - r_0 i) - \ln (z + r_0 i)]$$
$$= 2qi[\ln |z - ir_0| + i \arg (z - ir_0)] - 2qi[\ln |z + ir_0| + i \arg (z + ir_0)]$$

Hence

$$V = 2q[\ln |z - ir_0| - \ln |z + ir_0|] = q \ln \frac{x^2 + (y - r_0)^2}{x^2 + (y + r_0)^2}$$

For $y = 0$, $V = q \ln 1 = 0$.
Problems 


1. What electrostatic problem is solved by the transformation (4.38)? 

2. Map the rectangle of Fig. 4.21 with two vertices at infinity into the real axis of the $z$ plane. Show that $w = (a/\pi) \cosh^{-1} z$.

3. Find the electric field due to a charge $q$ placed midway between two infinite grounded planes (see Fig. 4.22).

Fig. 4.22

Fig. 4.23

4. Map the polygon of Fig. 4.23 into the real axis of the $z$ plane, and show that $w = (h/\pi)(\sqrt{z^2 - 1} + \cosh^{-1} z)$.

5. Consider a closed curve $C$ given parametrically by $x = f(t)$, $y = g(t)$, $a \leq t \leq b$. Consider the transformation $z = x + iy = f(w) + ig(w)$. Show that under this transformation the closed curve $C$ of the $z$ plane maps into the real axis of the $w$ plane.




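4.16. The Gamma Function. The Beta Function. It was Euler who first obtained a function of the real variable $x$ which is continuous for $x$ positive and which reduces to $n!$ when $x = n + 1$, $n$ a positive integer. We consider

$$\Gamma(x) = \int_0^{\infty} e^{-t} t^{x-1}\,dt \qquad (4.41)$$

The existence of the integral depends on the behavior of the integrand at $t = 0$ and at $t = \infty$. We first write

$$\Gamma(x) = \int_0^1 e^{-t} t^{x-1}\,dt + \int_1^{\infty} e^{-t} t^{x-1}\,dt$$

For $x > 0$ we have that $e^{-t} t^{x-1}$ behaves like $t^{x-1}$ for $t$ near zero. It is well known that $\int_0^1 t^{x-1}\,dt$ exists for $x > 0$ since

$$\lim_{\epsilon \to 0,\ \epsilon > 0} \int_{\epsilon}^1 t^{x-1}\,dt = \lim_{\epsilon \to 0} \left(\frac{1}{x} - \frac{\epsilon^x}{x}\right) = \frac{1}{x} \qquad x > 0$$

To investigate the second integral, we note that

$$e^{-t} t^{x-1} = e^{-t/2}\left(e^{-t/2} t^{x-1}\right) \qquad \lim_{t \to \infty} e^{-t/2} t^{x-1} = 0$$

Hence for $t \geq T$ there exists a constant $A$ such that $|e^{-t} t^{x-1}| < Ae^{-t/2}$. Since $\int_T^{\infty} Ae^{-t/2}\,dt$ exists, of necessity $\int_T^{\infty} e^{-t} t^{x-1}\,dt$ exists. We now integrate (4.41) by parts and obtain

$$\Gamma(x) = \left.\frac{e^{-t} t^x}{x}\right|_{t=0}^{t=\infty} + \frac{1}{x}\int_0^{\infty} e^{-t} t^x\,dt \qquad x > 0$$
$$= \frac{1}{x}\int_0^{\infty} e^{-t} t^x\,dt = \frac{1}{x}\Gamma(x + 1) \qquad (4.42)$$

Thus $\Gamma(x + 1) = x\Gamma(x)$. If $x$ is a positive integer, $x = n$, we have $\Gamma(n + 1) = n\Gamma(n) = n(n - 1)\Gamma(n - 1)$ and, by repeated applications, we obtain

$$\Gamma(n + 1) = n(n - 1)(n - 2) \cdots 2 \cdot 1 \cdot \Gamma(1) = n! \qquad (4.43)$$

since $\Gamma(1) = \int_0^{\infty} e^{-t}\,dt = 1$.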
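The relations (4.42) and (4.43) can be checked numerically. The sketch below approximates the integral (4.41) by the midpoint rule on a truncated interval (an assumption that is adequate only for moderate $x \geq 1$, where the integrand is bounded) and compares with Python's `math.gamma` and `math.factorial`.

```python
import math

def gamma_integral(x, upper=60.0, samples=100000):
    """Midpoint-rule estimate of the integral of e^(-t) t^(x-1) over 0 <= t <= upper.
    Intended for x >= 1; the neglected tail beyond `upper` is of order e^(-upper)."""
    h = upper / samples
    return sum(math.exp(-t) * t ** (x - 1)
               for t in ((k + 0.5) * h for k in range(samples))) * h

for n in range(6):
    print(n, gamma_integral(n + 1), math.factorial(n))

x = 2.7
print(math.gamma(x + 1), x * math.gamma(x))   # recursion (4.42)
```

Each estimate of $\Gamma(n + 1)$ should be close to $n!$, and the last line shows the recursion $\Gamma(x + 1) = x\Gamma(x)$ for a noninteger $x$.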
The gamma function of a complex variable $z$ is defined by

$$\Gamma(z) = \int_0^{\infty} e^{-t} t^{z-1}\,dt \qquad (4.44)$$

$t^{z-1}$ is defined as $t^{z-1} = e^{(z-1)\,\mathrm{Ln}\, t}$, where $\mathrm{Ln}\, t$ is the principal value of $\ln t$. By the same reasoning as above one deduces that $\Gamma(z)$ exists for $\mathrm{Rl}\, z > 0$, and $\Gamma(z + 1) = z\Gamma(z)$.

We can write

$$\Gamma(z) = \int_0^1 e^{-t} t^{z-1}\,dt + \int_1^{\infty} e^{-t} t^{z-1}\,dt$$
$$= \int_0^1 t^{z-1} \sum_{k=0}^{\infty} \frac{(-1)^k t^k}{k!}\,dt + \int_1^{\infty} e^{-t} t^{z-1}\,dt$$
$$= \int_0^1 \sum_{k=0}^{\infty} \frac{(-1)^k t^{k+z-1}}{k!}\,dt + \int_1^{\infty} e^{-t} t^{z-1}\,dt$$

Since $\sum_{k=0}^{\infty} \dfrac{(-1)^k t^{k+z-1}}{k!}$ converges uniformly on $(0, 1)$ for $\mathrm{Rl}\, z > 0$, we can interchange the order of integration and summation. Hence

$$\Gamma(z) = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!\,(z + k)} + \int_1^{\infty} e^{-t} t^{z-1}\,dt \qquad (4.45)$$

Now $\int_1^{\infty} e^{-t} t^{z-1}\,dt$ exists for all $z$, and $\sum_{k=0}^{\infty} \dfrac{(-1)^k}{k!\,(z + k)}$ converges for all $z$ other than $z = 0, -1, -2, \ldots, -n, \ldots$. Hence $\Gamma(z)$ defined by (4.45) is analytic everywhere except at $z = -n$, $n = 0, 1, 2, \ldots$. At $z = -n$, $\Gamma(z)$ has a simple pole. $\Gamma(z)$ defined by (4.45) is the analytic continuation of $\Gamma(z)$ as defined by (4.44).
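Formula (4.45) can be evaluated at points where the integral (4.44) does not converge. The following sketch, restricted to real $z$ for simplicity, sums a truncated series and approximates the remaining integral by the midpoint rule on a truncated interval; the truncation limits are assumptions, and the result is compared with `math.gamma`, which accepts negative nonintegers.

```python
import math

def gamma_continued(z, terms=60, upper=60.0, samples=100000):
    """Evaluate (4.45) for real z not equal to 0, -1, -2, ...:
    a truncated series plus a midpoint-rule estimate of the integral from 1 to `upper`."""
    series = sum((-1) ** k / (math.factorial(k) * (z + k)) for k in range(terms))
    h = (upper - 1.0) / samples
    tail = sum(math.exp(-t) * t ** (z - 1)
               for t in (1.0 + (k + 0.5) * h for k in range(samples))) * h
    return series + tail

for z in (-1.5, -0.5, 2.5):
    print(z, gamma_continued(z), math.gamma(z))
```

The two columns should agree to several decimal places, including at the negative arguments.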

The function $g(z) = \Gamma(z)\Gamma(1 - z)$ has simple poles at $z = 0, \pm 1, \pm 2, \pm 3, \ldots$ and is analytic elsewhere. The function $1/(\sin \pi z)$ has simple poles at $z = 0, \pm 1, \pm 2, \ldots$ and is analytic elsewhere. It can be shown that

$$\Gamma(z)\Gamma(1 - z) = \frac{\pi}{\sin \pi z} \qquad (4.46)$$

Now the right-hand side of (4.46) never vanishes. Hence, if a $z_0$ exists such that $\Gamma(z_0) = 0$, then, of necessity, $\Gamma(1 - z_0)$ could not be finite so that $1 - z_0$ is a pole of $\Gamma(z)$. Hence $1 - z_0 = -n$ and $z_0 = 1 + n$, $\Gamma(z_0) = \Gamma(1 + n) = n! \neq 0$, a contradiction. Thus $\Gamma(z)$ has no zeros.

We state without proof Legendre's duplication formula,

$$\sqrt{\pi}\,\Gamma(2z) = 2^{2z-1}\,\Gamma(z)\,\Gamma(z + \tfrac{1}{2}) \qquad (4.47)$$


An important result is Stirling's approximation to $n!$ for $n$ large. It is

$$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n \qquad (4.48)$$

This result will be obtained in Sec. 10.25 by use of the Euler-Maclaurin sum formula.
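Both the duplication formula (4.47) and Stirling's approximation (4.48) lend themselves to a quick numerical check. This is a small sketch using `math.gamma` and `math.factorial`; the function names are hypothetical.

```python
import math

def duplication_gap(z):
    """Relative difference between the two sides of (4.47) for real z > 0."""
    lhs = math.sqrt(math.pi) * math.gamma(2 * z)
    rhs = 2 ** (2 * z - 1) * math.gamma(z) * math.gamma(z + 0.5)
    return abs(lhs - rhs) / abs(lhs)

def stirling(n):
    """Right-hand side of (4.48)."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

print([duplication_gap(z) for z in (0.3, 1.0, 2.7)])
for n in (1, 5, 10, 50, 100):
    print(n, math.factorial(n) / stirling(n))
```

The ratio $n!/\text{stirling}(n)$ tends to 1 from above as $n$ grows, which is the sense in which (4.48) is an approximation for large $n$.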
The beta function defined by

$$B(p, q) = \int_0^1 t^{p-1}(1 - t)^{q-1}\,dt \qquad (4.49)$$

is closely related to the gamma function. The integral (4.49) converges if $\mathrm{Rl}\, p > 0$ and $\mathrm{Rl}\, q > 0$. We choose

$$t^{p-1} = e^{(p-1)\,\mathrm{Ln}\, t} \qquad (1 - t)^{q-1} = e^{(q-1)\,\mathrm{Ln}\,(1 - t)}$$

One can show with some labor that

$$B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p + q)} \qquad (4.50)$$

The substitution $u = (1 - t)/t$ or $t = 1/(1 + u)$ in (4.49) yields

$$B(p, q) = \int_0^{\infty} \frac{u^{q-1}\,du}{(1 + u)^{p+q}}$$

Thus

$$\Gamma(z)\Gamma(1 - z) = \int_0^{\infty} \frac{u^{z-1}}{1 + u}\,du \qquad 0 < \mathrm{Rl}\, z < 1$$

The substitution $u = e^t$ yields

$$\Gamma(z)\Gamma(1 - z) = \int_{-\infty}^{\infty} \frac{e^{zt}}{1 + e^t}\,dt$$

To evaluate $\int_{-\infty}^{\infty} \dfrac{e^{zt}}{1 + e^t}\,dt$ one applies the theorem of residues to the closed rectangle $C$ with vertices at $t = -R$, $t = S$, $t = S + 2\pi i$, $t = -R + 2\pi i$, $R$ and $S$ real and positive, $t$ the complex variable of integration. The pole of $e^{zt}/(1 + e^t)$ inside this contour exists at $t = \pi i$. The residue of $e^{zt}/(1 + e^t)$ at $t = \pi i$ is

$$\lim_{t \to \pi i} \frac{(t - \pi i)e^{zt}}{1 + e^t} = \lim_{t \to \pi i} \frac{e^{zt}}{e^t} = \frac{e^{\pi i z}}{e^{\pi i}} = -e^{\pi i z}$$

By letting $R$ and $S$ tend to infinity and using the fact that $0 < \mathrm{Rl}\, z < 1$ one can obtain $\Gamma(z)\Gamma(1 - z) = \pi/\sin \pi z$ [see (4.46)]. By analytic continuation it follows that (4.46) holds throughout the domain of definition of $1/\sin \pi z$.
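The relations (4.46) and (4.50) can be spot-checked numerically. The sketch below uses `math.gamma` for the gamma function and a midpoint-rule estimate of (4.49); the estimate is intended for $p, q \geq 1$ so that the integrand stays bounded, which is an assumption of this illustration.

```python
import math

def reflection_gap(z):
    """Absolute difference between the two sides of (4.46) for real noninteger z."""
    return abs(math.gamma(z) * math.gamma(1 - z) - math.pi / math.sin(math.pi * z))

def beta_numeric(p, q, samples=100000):
    """Midpoint-rule estimate of (4.49) for p, q >= 1."""
    h = 1.0 / samples
    return sum(((k + 0.5) * h) ** (p - 1) * (1 - (k + 0.5) * h) ** (q - 1)
               for k in range(samples)) * h

def beta_from_gamma(p, q):
    """Right-hand side of (4.50)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

print(reflection_gap(0.3), reflection_gap(-1.4))
print(beta_numeric(2.5, 3.0), beta_from_gamma(2.5, 3.0))
```

The reflection check at $z = -1.4$ lies outside $0 < \mathrm{Rl}\, z < 1$ and so illustrates the extension of (4.46) by analytic continuation.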


Problems 


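1. Show that $\Gamma(\frac{1}{2}) = \sqrt{\pi}$.

2. Prove that $(2n)! = 2^{2n}\, n!\, \Gamma(n + \frac{1}{2})\, \pi^{-1/2}$.

3. From (4.45) show that $\lim_{z \to -n} (z + n)\Gamma(z) = (-1)^n/n!$, $n$ a positive integer.

4. Show that

$$\int_0^{\pi/2} \sin^{2p-1}\theta \cos^{2q-1}\theta\,d\theta = \tfrac{1}{2}B(p, q)$$

for $\mathrm{Rl}\, p > 0$, $\mathrm{Rl}\, q > 0$.

5. Integrate $t^{z-1}e^{-t}$ around the complete boundary of a quadrant of a circle indented at the origin, and show that

$$\int_0^{\infty} t^{z-1} e^{-it}\,dt = e^{-\pi i z/2}\,\Gamma(z)$$

for $0 < \mathrm{Rl}\, z < 1$.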
6. Show that 


I, edz = 41(5) 




CHAPTER 5 


DIFFERENTIAL EQUATIONS 


5.1. General Remarks. A differential equation is any equation involving derivatives of a dependent variable with respect to one or more independent variables. Thus
dy 


ca +49 7 “4+ 5y = sin (5.2) 
ae 079 
(4 + v) +x = eV (5.4) 


are classified as differential equations. Equation (5.3) is called a partial 
differential equation for obvious reasons. The others are called ordinary 
differential equations. One may have a system of differential equations 
involving more than one dependent variable [see (5.5)], 


Oa Ae tay 

at? dt 
a (5.5) 
a ¥~? 


In addition to their own mathematical interest, differential equations are particularly important since the scientist attempts to describe the behavior of certain aspects of the universe in terms of differential equations. We list a few of this type:


” a s+ Ro ; + lx = f(z) (5.6) 
LS oars Ro = + din eer (5.7) 
wat ie — Viale = (5.9) 

” Re =0 (5.10) 




The order of the highest derivative occurring in a differential equation 
is called the order of the differential equation. 

5.2. Solution of a Differential Equation. Initial and Boundary Conditions. Let a differential equation be given involving the dependent variable $\varphi$ and the independent variables $x$ and $y$. Any function $\varphi(x, y)$ which satisfies the differential equation is called a solution of the differential equation. For example, it is easy to prove that $\varphi(x, y) = e^x \sin y$ satisfies
\[ \nabla^2\varphi = \frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial^2 \varphi}{\partial y^2} = 0 \]
We say that $e^x \sin y$ is a solution of $\nabla^2\varphi = 0$. It is important to realize that a differential equation has, in general, infinitely many solutions. For example, $y'' = 0$ admits any function $y = ax + b$ as a solution, where $a$ and $b$ are constants which can be chosen arbitrarily. To specify a particular solution, either initial conditions like

$y = 2$, $y' = 1$ when $x = 3$

or boundary conditions like

$y = 2$ when $x = 3$ and $y = 4$ when $x = -1$

must be given in addition to the differential equation. In the first case $y = x - 1$ is the solution, and in the second case $y = -\tfrac12 x + \tfrac72$ is the solution. A full study of a differential equation implies the determination of the most general solution of the equation (involving arbitrary elements which may or may not be constants) and a discussion of how many additional conditions must be imposed in order to fix uniquely the arbitrary elements entering in the general solution. Without attempting an exact statement or proof, at the moment, we state the fact that "in general" the most general solution of an ordinary differential equation of order $n$ contains exactly $n$ arbitrary constants to be uniquely determined by $n$ initial conditions.
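As a small illustration of how the two kinds of conditions fix the constants in $y = ax + b$, the following sketch solves for $a$ and $b$ exactly. The function names are hypothetical and chosen only for this example; exact rational arithmetic is an assumption made to avoid rounding.

```python
from fractions import Fraction

def line_from_initial(x0, y0, slope):
    """Return (a, b) for y = a*x + b with y(x0) = y0 and y'(x0) = slope."""
    a = Fraction(slope)
    b = Fraction(y0) - a * x0
    return a, b

def line_from_boundary(x1, y1, x2, y2):
    """Return (a, b) for y = a*x + b passing through (x1, y1) and (x2, y2)."""
    a = Fraction(y1 - y2, x1 - x2)
    b = Fraction(y1) - a * x1
    return a, b

# Initial conditions: y = 2, y' = 1 when x = 3.
initial = line_from_initial(3, 2, 1)
# Boundary conditions: y = 2 when x = 3 and y = 4 when x = -1.
boundary = line_from_boundary(3, 2, -1, 4)
print(initial, boundary)
```

The first pair corresponds to $y = x - 1$ and the second to $y = -\tfrac12 x + \tfrac72$, in agreement with the text.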


Problems 


1. Verify that the following functions are solutions of the corresponding differential 
equations: 


() yaatoe thy tay" ~y tee = el — er 
07u 7 Ou ou : 
ox? 7 ax ay rae) 


()ytz+1=0 (y — x)y’ — (yy — 2) = 0 
2. Verify that $y = ae^{x} + be^{2x}$ is a solution of the differential equation
\[ y'' - 3y' + 2y = 0 \]
Determine the particular solution which satisfies the initial conditions $y = 0$, $y' = 3$ for $x = 0$.

3. Find the solution of y’ + 3z2e7 = 0 which satisfies the initial condition y = 1 
when z = 1. 


()u=2?—-y 




5.3. The Differential Equation of a Family of Curves. The family of straight lines in the plane is characterized by the equation $y = ax + b$, where $a$ and $b$ are arbitrary constants. To fix these constants means to select one member of the family, that is, to fix our attention on one straight line among all the others. The differential equation $y'' = 0$ is called the differential equation of this family of straight lines because every function of the form $y = ax + b$ satisfies this equation and, conversely, every solution of $y'' = 0$ is a member of the family $y = ax + b$. The differential equation $y'' = 0$ characterizes the family as a whole without specific reference to the particular members.

More generally, a family of curves can be described by
\[ y = f(x, a_1, a_2, \ldots, a_n) \]
or implicitly by $F(x, y, a_1, a_2, \ldots, a_n) = 0$, in which $n$ arbitrary constants appear. The differential equation of the family is obtained by successively differentiating $n$ times and eliminating the constants between the resulting $n + 1$ relations. The differential equation that results is of order $n$.


Example 5.1. To find the differential equation of the family of parabolas
\[ y = ax + bx^2 \]
we differentiate twice to obtain
\[ y' = a + 2bx \qquad y'' = 2b \]
The last equation is solved for $b$, and the result is substituted into the previous equation. This equation is solved for $a$, and the expressions for $a$ and $b$ are substituted into $y = ax + bx^2$. The result is the differential equation
\[ y = xy' - \tfrac12 x^2 y'' \]
The elimination of the constants $a$ and $b$ can also be obtained by considering the equations
\[
\begin{aligned}
xa + x^2 b + (-y)1 &= 0 \\
1a + 2xb + (-y')1 &= 0 \\
2b + (-y'')1 &= 0
\end{aligned}
\]
as a system of homogeneous linear equations in $a$, $b$, 1. The solution $(a, b, 1)$ is nontrivial, and hence the determinant of the coefficients vanishes.
\[
\begin{vmatrix}
x & x^2 & -y \\
1 & 2x & -y' \\
0 & 2 & -y''
\end{vmatrix} = 0
\]
Expansion about the third column yields the result above.
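The elimination in Example 5.1 can be spot-checked with exact arithmetic. The sketch below is only an illustration (the helper names `det3` and `check` are not from the text): for sample values of $a$, $b$, $x$ it evaluates both the residual of $y = xy' - \tfrac12 x^2 y''$ and the determinant of the coefficients.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def check(a, b, x):
    """Return (ODE residual, determinant) for the parabola y = a*x + b*x**2."""
    y = a * x + b * x**2
    dy = a + 2 * b * x
    d2y = 2 * b
    ode_residual = y - (x * dy - Fraction(1, 2) * x**2 * d2y)
    determinant = det3([[x, x**2, -y], [1, 2 * x, -dy], [0, 2, -d2y]])
    return ode_residual, determinant

print(check(2, -3, 5))
print(check(Fraction(1, 3), 7, -4))
```

Both quantities vanish for every choice of the constants, which is what it means for the differential equation to characterize the whole family.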
Problems

Find the differential equations whose solutions are the families
2. $y = C_1 \cos 2x + C_2 \sin 2x$

3. $y = e^{x}(C_1 + C_2 x)$

4. Find the differential equation of all circles which have their centers on the $y$ axis.

5. Find the differential equation of all circles in the plane.

6. Find the differential equation of all parabolas whose principal axes are parallel to the $x$ axis.

7. Find the differential equation of all straight lines whose intercepts total 1.


5.4. Ordinary Differential Equations of the First Order and First Degree. Let $y$ be the dependent variable, and let $x$ be the independent variable. The most general equation of the first order is any equation involving $x$, $y$, and $y'$. If $y'$ enters in the equation only linearly, that is, only in the first power, the equation is said to be of the first degree. Such an equation can be written in the form
\[ \frac{dy}{dx} = f(x, y) = \frac{M(x, y)}{N(x, y)} \tag{5.11} \]
We shall see later that under suitable restrictions on $f(x, y)$ there always exists a unique solution $y = \varphi(x)$, such that $y_0 = \varphi(x_0)$ and
\[ \frac{d\varphi(x)}{dx} = f(x, \varphi(x)) \]
The discussion of (5.11) will be restricted to the simplest cases at present:

(a) $f(x, y) = \alpha(x)\beta(y)$.
(b) $f(x, y)$ is homogeneous of degree zero, that is, $f(tx, ty) = t^n f(x, y)$, $n = 0$.
(c) Exact equations.
(d) Integrating factors.
(e) $f(x, y) = -p(x)y + q(x)$.
Case (a): Separation of Variables. If
\[ \frac{dy}{dx} = -\frac{M(x)}{N(y)} \tag{5.12} \]
then $M(x)\,dx + N(y)\,dy = 0$. Consider
\[ F(x, y) = \int^x M(x)\,dx + \int^y N(y)\,dy = \text{constant} \tag{5.13} \]
The integrals in (5.13) are indefinite (no lower limit). We wish to show that $y$ as a function of $x$ given implicitly by (5.13) satisfies (5.12). We have
\[ \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\frac{dy}{dx} = 0 \qquad \frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y} \qquad \frac{\partial F}{\partial y} \neq 0 \]
But $\dfrac{\partial F}{\partial x} = M(x)$, $\dfrac{\partial F}{\partial y} = N(y)$, so that $\dfrac{dy}{dx} = -\dfrac{M(x)}{N(y)}$. Q.E.D. Conversely, let $y = y(x)$ be any solution of (5.12), so that
\[ \frac{dy(x)}{dx} = -\frac{M(x)}{N(y(x))} \tag{5.14} \]
Consider $F(x, y(x)) = \int^x M(x)\,dx + \int^{y(x)} N(y)\,dy$. We have
\[ \frac{dF}{dx} = M(x) + N(y)\frac{dy}{dx} = 0 \]
from (5.14) so that $F(x, y(x)) = \text{constant}$. Hence any solution of (5.12) can be obtained from (5.13) by solving implicitly for $y$ in terms of $x$.


Example 5.2. We solve $y\,dx + (1 + x^2)\tan^{-1}x\,dy = 0$. Separating the variables, we have
\[ \frac{dx}{(1 + x^2)\tan^{-1}x} + \frac{dy}{y} = 0 \]
Integration yields $\ln \tan^{-1}x + \ln y = C_1 = \ln C_2$, so that $y\tan^{-1}x = C_2$ or
\[ y = C_2(\tan^{-1}x)^{-1} \]
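The result of Example 5.2 can be compared with a direct numerical integration. The sketch below is an illustration only: it assumes the starting point $x = 1$, $y = 1$ (not taken from the text) and uses the classical fourth-order Runge-Kutta method, a technique the text itself does not discuss.

```python
import math

def f(x, y):
    """Right-hand side dy/dx = -y / ((1 + x^2) arctan x) from Example 5.2."""
    return -y / ((1.0 + x * x) * math.atan(x))

def rk4(f, x0, y0, x1, steps):
    """Integrate dy/dx = f(x, y) from x0 to x1 with classical RK4."""
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

# Assumed initial point (x, y) = (1, 1), so that C2 = arctan(1) = pi/4.
numeric = rk4(f, 1.0, 1.0, 2.0, 200)
closed_form = (math.pi / 4) / math.atan(2.0)
print(numeric, closed_form)
```

The two printed values should agree to many decimal places, since $y \tan^{-1} x$ stays constant along the solution.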
Problems 
Solve: 
1. $dx - \sqrt{a^2 - x^2}\,dy = 0$
2. y2cos Vrdxe +2Vre dy =0 
3. $x(y^2 - 1)\,dx + y(x^2 + 1)\,dy = 0$
4. $xy\,\dfrac{dy}{dx} = 1 + y^2$

5. For what curves is the portion of the tangent between the axes bisected at the point of contact?

6. Find the general equation of all curves for which the tangent makes a constant angle $\varphi$ with the radius vector.

7. Find the function which is equal to zero when $x = 1$, and to 1 when $x = 4$, and whose rate of change is inversely proportional to its value.


8. Find r(z) if ee = Vai — rt, 


9. Find the family of curves intersecting the family of parabolas $y^2 = 4px$ at right angles for all $p$. These curves are called the orthogonal trajectories of the system of parabolas.

10. The area bounded by the $x$ axis, the arc of a curve, a fixed ordinate, and a variable ordinate is proportional to the arc between these ordinates. Find the equation of the curve.

11. Assume that a drop (sphere) of liquid evaporates at a rate proportional to its area of surface. Find the radius of the drop as a function of the time.

12. Brine containing 2 lb of salt per gallon runs at the rate of 3 gpm into a 10-gal tank initially filled with fresh water. The mixture is stirred uniformly and flows out at the same rate. Find the amount of the salt in the tank at the end of 1 hr.

13. It begins to snow some time before noon and continues to snow at a constant rate throughout the day. At noon a machine begins to shovel at a constant rate. By 1400 two blocks of snow have been cleared, and by 1600 one more block of snow is cleared. What time before noon did it begin to snow? Assume width of street a constant.




Case (b). We say that $f(x, y)$ is homogeneous in $x$ and $y$ of degree $n$ if $f(tx, ty) = t^n f(x, y)$. For example, $f(x, y) = x^2 + y^2$ is homogeneous of degree 2 since $f(tx, ty) = t^2x^2 + t^2y^2 = t^2(x^2 + y^2) = t^2 f(x, y)$. We now consider
\[ \frac{dy}{dx} = f(x, y) \tag{5.15} \]
where $f(tx, ty) = t^n f(x, y)$. We let $y = tx$, so that $\dfrac{dy}{dx} = t + x\dfrac{dt}{dx}$. Moreover $f(x, tx) = x^n f(1, t)$ so that (5.15) becomes
\[ t + x\frac{dt}{dx} = x^n f(1, t) \tag{5.16} \]
If $n = 0$, that is, if $f(x, y)$ is homogeneous of degree zero, then
\[ \frac{dt}{f(1, t) - t} = \frac{dx}{x} \tag{5.17} \]
and the variables have been separated. We can solve for $t = t(x)$, and $y = x\,t(x)$ satisfies (5.15). In particular, if
\[ \frac{dy}{dx} = \frac{M(x, y)}{N(x, y)} \]
and $M(x, y)$, $N(x, y)$ are homogeneous of the same degree, then $M/N$ is homogeneous of degree zero, so that the substitution $y = tx$ yields an equation in $t$ and $x$ with variables separable.


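Example 5.3. Consider
\[ \frac{dy}{dx} = \frac{x + y}{x - y} \]
$f(x, y) = (x + y)/(x - y)$ is homogeneous of degree zero. Let $y = tx$, so that
\[
\begin{aligned}
t + x\frac{dt}{dx} &= \frac{x + tx}{x - tx} = \frac{1 + t}{1 - t} \\
\frac{1 - t}{1 + t^2}\,dt &= \frac{dx}{x} \\
\tan^{-1} t - \tfrac12 \ln(1 + t^2) &= \ln x + \ln C \\
\tan^{-1}\frac{y}{x} - \tfrac12 \ln\left(1 + \frac{y^2}{x^2}\right) &= \ln Cx
\end{aligned}
\]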





Problems 
1, 4 = y* 
“dx x? — xy 
2 dy_ytvet+y 
* dx x 
gy tty 


dx x 




dy _ttytl , . = Ff = 
a gag eo Hint: Let r = Z +a, y +6, and find a and b so that 


dy ity 
di 2£-y 


5. Discuss (5.15) if $f(1, t) = t$ in (5.17).


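Case (c): Exact Equations. We consider
\[ \frac{dy}{dx} = -\frac{M(x, y)}{N(x, y)} \]
or
\[ M(x, y)\,dx + N(x, y)\,dy = 0 \tag{5.18} \]
We say that Eq. (5.18) is exact if there exists a function $\varphi(x, y)$ such that
\[ d\varphi(x, y) = M(x, y)\,dx + N(x, y)\,dy \tag{5.19} \]
If (5.18) is exact, then $d\varphi = 0$ and $\varphi(x, y) = \text{constant}$ is a solution of (5.18). From the calculus
\[ d\varphi = \frac{\partial \varphi}{\partial x}\,dx + \frac{\partial \varphi}{\partial y}\,dy \tag{5.20} \]
so that for (5.18) to be exact we must have
\[ \frac{\partial \varphi}{\partial x} = M(x, y) \qquad \frac{\partial \varphi}{\partial y} = N(x, y) \tag{5.21} \]
Further differentiation yields
\[ \frac{\partial^2 \varphi}{\partial y\,\partial x} = \frac{\partial M}{\partial y} \qquad \frac{\partial^2 \varphi}{\partial x\,\partial y} = \frac{\partial N}{\partial x} \tag{5.22} \]
If we assume continuity of the second mixed partials, of necessity
\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \tag{5.23} \]
Equation (5.23) is a necessary condition that (5.18) be exact.

Conversely, assume $\dfrac{\partial M}{\partial y} = \dfrac{\partial N}{\partial x}$. We show that a function $\varphi(x, y)$ exists such that $d\varphi = M\,dx + N\,dy$. Consider
\[ \varphi(x, y) = \int_{x_0}^{x} M(x, y)\,dx + \int_{y_0}^{y} N(x_0, y)\,dy \tag{5.24} \]
$x_0$ and $y_0$ are arbitrary constants. Then
\[
\begin{aligned}
\frac{\partial \varphi}{\partial x} &= M(x, y) \\
\frac{\partial \varphi}{\partial y} &= \int_{x_0}^{x} \frac{\partial M}{\partial y}\,dx + N(x_0, y) \\
&= \int_{x_0}^{x} \frac{\partial N}{\partial x}\,dx + N(x_0, y) \\
&= N(x, y) - N(x_0, y) + N(x_0, y) \\
&= N(x, y)
\end{aligned}
\]
Hence (5.24) yields the required function $\varphi(x, y)$.

If this seems familiar, there is a good reason, since the same material was covered in Chap. 2, Vector Analysis. Let $\mathbf{f} = M(x, y)\mathbf{i} + N(x, y)\mathbf{j}$. Then $\mathbf{f} \cdot d\mathbf{r} = M\,dx + N\,dy$. If $\mathbf{f} = \nabla\varphi$, then $d\varphi = M\,dx + N\,dy$. But $\mathbf{f} = \nabla\varphi$ implies $\nabla \times \nabla\varphi = 0$, which is statement (5.23).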


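Example 5.4. Consider
\[ (\ln x + \ln y + 1)\,dx + \frac{x}{y}\,dy = 0 \tag{5.25} \]
$M = \ln x + \ln y + 1$, $N = \dfrac{x}{y}$, $\dfrac{\partial M}{\partial y} = \dfrac{1}{y} = \dfrac{\partial N}{\partial x}$, so that (5.25) is exact. Applying (5.24) yields
\[ \varphi(x, y) = \int_1^x (\ln x + \ln y + 1)\,dx + \int_1^y \frac{1}{y}\,dy = \text{constant} \]
Integration yields
\[ \varphi(x, y) = \Big[x \ln x - x + x \ln y + x\Big]_1^x + \Big[\ln y\Big]_1^y = c \]
or
\[
\begin{aligned}
x \ln x - x + 1 + x \ln y - \ln y + x - 1 + \ln y &= c \\
x \ln (xy) &= c
\end{aligned}
\]
We chose the lower limits to be 1 rather than zero since $\ln x$ is not defined at $x = 0$. The reader can check easily that
\[ d[x \ln (xy)] = (\ln x + \ln y + 1)\,dx + \frac{x}{y}\,dy \]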
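The check suggested at the end of Example 5.4 can also be carried out numerically. The sketch below is an illustration under stated assumptions: partial derivatives are approximated by central differences with a hypothetical step `h`, and the sample point $(2, 3)$ is arbitrary.

```python
import math

def phi(x, y):
    """Candidate potential from Example 5.4."""
    return x * math.log(x * y)

def M(x, y):
    return math.log(x) + math.log(y) + 1.0

def N(x, y):
    return x / y

def partial_x(g, x, y, h=1e-6):
    """Central-difference approximation to dg/dx."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-6):
    """Central-difference approximation to dg/dy."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x, y = 2.0, 3.0
# Exactness test (5.23): dM/dy should equal dN/dx.
exactness_gap = partial_y(M, x, y) - partial_x(N, x, y)
# Conditions (5.21): dphi/dx = M and dphi/dy = N.
gap_x = partial_x(phi, x, y) - M(x, y)
gap_y = partial_y(phi, x, y) - N(x, y)
print(exactness_gap, gap_x, gap_y)
```

All three printed numbers should be zero up to the accuracy of the difference quotients.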


Problems 


1. $(xe^x + 2\cos y)\,dx - 2x \sin y\,dy = 0$
x dy 


Y 
._ dr —- —*. 
ty ety? 
3. $\sin y\,e^{x \sin y}\,dx + (xe^{x \sin y}\cos y - 2\cos y \sin y)\,dy = 0$


Case (d): Integrating Factors. It may readily turn out that (5.18) is not exact, that is, $\dfrac{\partial M}{\partial y} \neq \dfrac{\partial N}{\partial x}$. Let us assume that this is so. Perhaps one can find a function $\mu(x, y)$ such that
\[ \mu M\,dx + \mu N\,dy = 0 \tag{5.26} \]
is now exact. We call $\mu(x, y)$ an integrating factor of (5.18). The solution of (5.26) is also a solution of (5.18) since in both cases $\dfrac{dy}{dx} = -\dfrac{M}{N}$. In general it is very difficult to find an integrating factor. There are two cases, however, for which an integrating factor can be found. Let us determine the condition on $M$ and $N$ such that $\mu = \mu(y)$ is an integrating factor. Of necessity
\[ \frac{\partial}{\partial y}\big(\mu(y) M\big) = \frac{\partial}{\partial x}\big(\mu(y) N\big) \]
and
\[ \frac{d\mu}{dy} M + \mu \frac{\partial M}{\partial y} = \mu \frac{\partial N}{\partial x} \]
so that
\[ \frac{1}{\mu}\frac{d\mu}{dy} = \frac{\partial N/\partial x - \partial M/\partial y}{M} \tag{5.27} \]
If $\dfrac{\partial N/\partial x - \partial M/\partial y}{M}$ is a function of $y$ only, then (5.27) can be solved for $\mu$, for in this case
\[ \mu(y) = \exp \int \frac{\partial N/\partial x - \partial M/\partial y}{M}\,dy \]
Example 5.5. Consider
\[ -y\,dx + x\,dy = 0 \tag{5.28} \]
We have $\dfrac{\partial M}{\partial y} = -1$, $\dfrac{\partial N}{\partial x} = 1$, so that (5.28) is not exact. However,
\[ \frac{\partial N/\partial x - \partial M/\partial y}{M} = -\frac{2}{y} \]
so that an integrating factor exists, namely,
\[ \mu(y) = \exp\left(-\int \frac{2}{y}\,dy\right) = e^{-2 \ln y} = e^{\ln y^{-2}} = \frac{1}{y^2} \]
Multiplying (5.28) by $1/y^2$ yields
\[ -\frac{1}{y}\,dx + \frac{x}{y^2}\,dy = 0 \tag{5.29} \]
The reader can easily verify that (5.29) is exact. The solution of (5.29) is $x/y = \text{constant}$, which is also a solution of (5.28).
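The effect of the integrating factor in Example 5.5 can be seen numerically. The following sketch is illustrative only: the helper `exactness_gap` is a name introduced here, derivatives are central-difference approximations, and the sample point $(2, 3)$ is arbitrary.

```python
def partial_x(g, x, y, h=1e-6):
    """Central-difference approximation to dg/dx."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-6):
    """Central-difference approximation to dg/dy."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

def exactness_gap(M, N, x, y):
    """dM/dy - dN/dx, which vanishes when M dx + N dy is exact."""
    return partial_y(M, x, y) - partial_x(N, x, y)

def M(x, y):
    return -y          # coefficient of dx in (5.28)

def N(x, y):
    return x           # coefficient of dy in (5.28)

def mu(y):
    return 1.0 / y**2  # integrating factor found in Example 5.5

gap_before = exactness_gap(M, N, 2.0, 3.0)
gap_after = exactness_gap(lambda x, y: mu(y) * M(x, y),
                          lambda x, y: mu(y) * N(x, y), 2.0, 3.0)
print(gap_before, gap_after)
```

Before multiplication the gap is close to $-2$; after multiplication by $1/y^2$ it is close to zero, as (5.29) requires.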
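For an integrating factor of the type $\mu(x)$, one needs of necessity
\[ \frac{1}{N}\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right) = \text{a function of } x \text{ only} \tag{5.30} \]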


Problems 


1. $2xy\,dx + (y^2 - 3x^2)\,dy = 0$
2. $(x - xy^2)\,dx + xy\,dy = 0$




3. Prove the statement concerning (5.30).
_¥ 2y _ ~) = 
4. (3 9) de +(4 *) dy =0 


Case (e): Linear Equations of the First Order. The equation
\[ \frac{dy}{dx} + p(x)y = q(x) \tag{5.31} \]
is called a linear differential equation of the first order. The expression "linear" refers to the fact that both $y$ and $y'$ occur linearly in (5.31). The functions $p(x)$ and $q(x)$ need not be linear in $x$. If $q(x) = 0$, (5.31) is said to be homogeneous; otherwise it is inhomogeneous.

The solution of the homogeneous equation
\[ \frac{dy}{dx} + p(x)y = 0 \]
is trivial since the equation is separable. Hence
\[ y = A \exp\left[-\int p(x)\,dx\right] \tag{5.32} \]
where $A$ is a constant of integration, $\int p(x)\,dx$ is an indefinite integral. Now we attempt to use (5.32) in order to find a solution of (5.31). We introduce a new variable $z(x)$ by the equation
\[ y = z \exp\left[-\int p(x)\,dx\right] \qquad \text{or} \qquad z = y \exp \int p(x)\,dx \]
where $y(x)$ is a solution of (5.31). Then
\[ \frac{dz}{dx} = \frac{dy}{dx} \exp \int p(x)\,dx + yp \exp \int p(x)\,dx \]
But $\dfrac{dy}{dx} = q(x) - p(x)y$ from (5.31) so that
\[
\begin{aligned}
\frac{dz}{dx} &= (q - py) \exp \int p(x)\,dx + py \exp \int p(x)\,dx \\
&= q(x) \exp \int p(x)\,dx
\end{aligned}
\]
Hence
\[ z(x) = \int q(x) \left(\exp \int p(x)\,dx\right) dx + C \]
so that
\[ y(x) = \exp\left[-\int p(x)\,dx\right] \left[\int q(x) \left(\exp \int p(x)\,dx\right) dx + C\right] \tag{5.33} \]
The reader can easily show that (5.33) is a solution of (5.31).




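Example 5.6. Consider
\[ \frac{dy}{dx} + y \cot x = x^2 \csc x \]
Here $p(x) = \cot x$, $q(x) = x^2 \csc x$ so that (5.33) becomes
\[
\begin{aligned}
y(x) &= \exp\left(-\int \cot x\,dx\right) \left[\int x^2 \csc x \left(\exp \int \cot x\,dx\right) dx + C\right] \\
&= e^{-\ln \sin x} \left(\int x^2 \csc x\, e^{\ln \sin x}\,dx + C\right) \\
&= \frac{1}{\sin x} \left(\int x^2\,dx + C\right) \\
&= \frac{1}{\sin x} \left(\frac{x^3}{3} + C\right)
\end{aligned}
\]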
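The result of Example 5.6 can be substituted back into the equation numerically. The sketch below is an illustration: the derivative is a central-difference approximation with an assumed step `h`, and the sample values of $x$ and $C$ are arbitrary.

```python
import math

def y(x, C):
    """Solution found in Example 5.6: y = (x^3/3 + C) / sin x."""
    return (x**3 / 3.0 + C) / math.sin(x)

def residual(x, C, h=1e-5):
    """Left side minus right side of dy/dx + y cot x = x^2 csc x."""
    dy = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dy + y(x, C) / math.tan(x) - x**2 / math.sin(x)

print(residual(1.0, 2.0), residual(2.0, -1.0))
```

Both residuals should be zero to within the error of the difference quotient, whatever the value of the constant $C$.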


Example 5.7. We now investigate (5.33) in more detail. We have
\[
\begin{aligned}
y(x) &= C \exp\left[-\int p(x)\,dx\right] + \exp\left[-\int p(x)\,dx\right] \left[\int q(x) \left(\exp \int p(x)\,dx\right) dx\right] \\
&= y_1(x) + y_2(x)
\end{aligned}
\]
We notice that $y_1(x) = C \exp\left[-\int p(x)\,dx\right]$ is the general solution of the homogeneous equation $\dfrac{dy}{dx} + p(x)y = 0$. Moreover $y_2(x)$ is a particular solution of (5.31) obtained from (5.33) by choosing $C = 0$. In other words we have Theorem 5.1.

THEOREM 5.1. The most general solution of the inhomogeneous equation is the sum of a particular solution plus the general solution of the corresponding homogeneous equation.

This important result is valid also for linear equations of higher order and is discussed later. In simple cases a particular solution can be found by inspection. For example, consider
\[ \frac{dy}{dx} + y = x \tag{5.34} \]
We look for a particular solution of the form $y = ax + b$. Then
\[ a + ax + b = x \]
holds if $a + b = 0$, $a = 1$, so that $b = -1$. Hence $y = x - 1$ is a particular solution of (5.34). This method of obtaining a particular solution is called the method of undetermined coefficients. The general solution of the corresponding homogeneous equation, $\dfrac{dy}{dx} + y = 0$, is $y = ce^{-x}$, so that
\[ y = ce^{-x} + x - 1 \]
is the general solution of (5.34).
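Formula (5.33) can also be evaluated numerically and compared with the closed form just obtained for (5.34). The sketch below is a rough illustration under stated assumptions: the indefinite integrals are replaced by definite integrals from a chosen $x_0$ (so that the constant $C$ becomes $y(x_0)$), and Simpson's rule, which the text does not discuss, is used for the quadrature. The function names are hypothetical.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def linear_first_order(p, q, x0, y0, x):
    """Evaluate (5.33) with integrals taken from x0, so that y(x0) = y0."""
    def P(t):
        return simpson(p, x0, t)          # integral of p from x0 to t
    inner = simpson(lambda t: q(t) * math.exp(P(t)), x0, x)
    return math.exp(-P(x)) * (inner + y0)

# Equation (5.34): p(x) = 1, q(x) = x, with the assumed condition y(0) = 2.
numeric = linear_first_order(lambda t: 1.0, lambda t: t, 0.0, 2.0, 1.0)
# Closed form y = c e^{-x} + x - 1 with c = 3 chosen so that y(0) = 2.
closed_form = 3.0 * math.exp(-1.0) + 1.0 - 1.0
print(numeric, closed_form)
```

The nested quadrature is wasteful but keeps the sketch close to the shape of (5.33); the two printed values should agree closely.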
Problems 
d 
1. + rye 


dz 2 
a yzrewl-—y 


dy 
—o=- —- cos IT = 0 
3 dz y 


4. Change $\dfrac{dy}{dx} + p(x)y = q(x)y^n$ to a linear equation by the substitution $z = y^{1-n}$, $n \neq 1$. What if $n = 1$?
Review Problems 
1, 2 
ata z 2 pe 

2. A body falls because of gravity. There is a retarding force of friction proportional to the speed of the body, say, $kv$. If the body starts from rest, find the distance fallen as a function of the time.

8. (x + y)*dzx + 2% dy = 0 

4. $[\sin (xy) + xy \cos (xy)]\,dx + x^2 \cos (xy)\,dy = 0$

6. e#(Qry — Qrty?) dz + e~**(3x? — xty) dy = 0 

6. Brine containing 2 lb of salt per gallon runs at the rate of 3 gpm into a 10-gal tank initially filled and containing 15 lb of salt. The mixture is stirred uniformly and flows out at the same rate into another 10-gal tank initially filled with pure water. This mixture is also uniformly stirred and is emptied at the rate of 3 gpm. Find the amount of brine in the second tank at any time $t$.


5.5. An Existence Theorem. We consider the equation
\[ \frac{dy}{dx} = f(x, y) \tag{5.35} \]
We wish to show that if $f(x, y)$ is suitably restricted in a neighborhood of the point $P_0(x_0, y_0)$ there exists a unique function $y = y(x)$ which satisfies (5.35) such that $y_0 = y(x_0)$. The restriction we impose on $f(x, y)$ is the following: Assume that a constant $M < \infty$ exists such that for all points $P(x, y)$, $Q(x, z)$ in some neighborhood of $P_0(x_0, y_0)$ we have
\[ |f(x, z) - f(x, y)| < M|z - y| \tag{5.36} \]
A function $f(x, y)$ satisfying (5.36) is said to obey the Lipschitz condition.

An immediate consequence of (5.36) is the following: If $\dfrac{\partial f}{\partial y}$ exists in a neighborhood of $P_0(x_0, y_0)$, then $\dfrac{\partial f}{\partial y}$ is uniformly bounded in that neighborhood, for [from (5.36)]
\[ \left|\frac{\partial f}{\partial y}\right| = \lim_{z \to y} \left|\frac{f(x, z) - f(x, y)}{z - y}\right| \leq M \]
Conversely, if $\dfrac{\partial f}{\partial y}$ is continuous for a closed neighborhood of $P_0(x_0, y_0)$, then $f(x, y)$ satisfies the Lipschitz criterion for that neighborhood. Remember that a continuous function on a closed and bounded set is uniformly bounded over the set. We assume further that $f(x, y)$ is continuous so that the integrals of (5.37) exist.




The proof of the existence of a solution of (5.35) is based upon Picard's method of successive approximations. Define the sequence of functions $y_1(x), y_2(x), \ldots, y_n(x), \ldots$ as follows:
\[
\begin{aligned}
y_1(x) &= \int_{x_0}^{x} f(x, y_0)\,dx + y_0 \\
y_2(x) &= \int_{x_0}^{x} f(x, y_1(x))\,dx + y_0 \\
y_3(x) &= \int_{x_0}^{x} f(x, y_2(x))\,dx + y_0 \\
&\;\;\vdots \\
y_{n-1}(x) &= \int_{x_0}^{x} f(x, y_{n-2}(x))\,dx + y_0 \\
y_n(x) &= \int_{x_0}^{x} f(x, y_{n-1}(x))\,dx + y_0
\end{aligned}
\tag{5.37}
\]
The $y_i(x)$, $i = 1, 2, \ldots, n, \ldots$, are obtained in a very natural manner. In $f(x, y)$ we replace $y$ by the initial constant $y_0$, integrate, and obtain $y_1(x)$ such that $y_1(x_0) = y_0$. We now replace $y$ in $f(x, y)$ by $y_1(x)$, integrate, and obtain $y_2(x)$. This process is continued indefinitely. The next endeavor is to show that the sequence thus obtained converges to a function $y(x)$ which we hope is the solution of (5.35). Let us note that
\[ y_n(x) = y_0 + (y_1 - y_0) + (y_2 - y_1) + \cdots + (y_n - y_{n-1}) \]
Hence the investigation of the convergence of the sequence hinges on the convergence of the series
\[ y_0 + (y_1 - y_0) + (y_2 - y_1) + \cdots + (y_n - y_{n-1}) + \cdots \tag{5.38} \]


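Let $K$ be an upper bound of $|f(x, y)|$ in the neighborhood of $P_0(x_0, y_0)$ for which the Lipschitz condition holds. Then from (5.37)
\[ |y_1(x) - y_0| = \left|\int_{x_0}^{x} f(x, y_0)\,dx\right| \leq \left|\int_{x_0}^{x} K\,dx\right| = K|x - x_0| \tag{5.39} \]
Also from (5.37)
\[
\begin{aligned}
|y_2(x) - y_1(x)| &= \left|\int_{x_0}^{x} [f(x, y_1) - f(x, y_0)]\,dx\right| \\
&< \left|\int_{x_0}^{x} M|y_1 - y_0|\,dx\right|
\end{aligned}
\]
from the Lipschitz condition. Applying (5.39) yields
\[
\begin{aligned}
|y_2(x) - y_1(x)| &< MK \left|\int_{x_0}^{x} |x - x_0|\,dx\right| \\
&< MK \frac{|x - x_0|^2}{2!}
\end{aligned}
\]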




Let the reader show that
\[ |y_3(x) - y_2(x)| < KM^2 \frac{|x - x_0|^3}{3!} \]
and by mathematical induction that
\[ |y_n(x) - y_{n-1}(x)| < KM^{n-1} \frac{|x - x_0|^n}{n!} \tag{5.40} \]
Hence each term of the series (5.38) is bounded in absolute value by the terms of the converging series
\[ |y_0| + \frac{K}{M} \sum_{n=1}^{\infty} \frac{M^n |x - x_0|^n}{n!} = \frac{K}{M}\left(e^{M|x - x_0|} - 1\right) + |y_0| \]
Since the series for $e^{M|x - x_0|}$ converges uniformly in any closed and bounded set, the series (5.38) converges uniformly for those $x$ in the region for which the Lipschitz condition holds. We have
\[ \lim_{n \to \infty} y_n(x) = y(x) \]
and the convergence is uniform. From
\[ y_n(x) = \int_{x_0}^{x} f(x, y_{n-1}(x))\,dx + y_0 \]
we have
\[
\begin{aligned}
y(x) = \lim_{n \to \infty} y_n(x) &= \lim_{n \to \infty} \int_{x_0}^{x} f(x, y_{n-1}(x))\,dx + y_0 \\
&= \int_{x_0}^{x} \lim_{n \to \infty} f(x, y_{n-1}(x))\,dx + y_0
\end{aligned}
\]
because of uniform convergence. The Lipschitz condition (5.36) also guarantees continuity of $f(x, y)$ with respect to $y$, for
\[ \lim_{z \to y} |f(x, z) - f(x, y)| = 0 \]
since $M|z - y| \to 0$ as $z \to y$. Hence
\[ y(x) = \int_{x_0}^{x} f(x, y(x))\,dx + y_0 \]
and
\[ \frac{dy}{dx} = f(x, y(x)) \qquad y(x_0) = y_0 \]
so that $y(x)$ satisfies (5.35) and the initial condition. Q.E.D.
We must now show that $y(x)$ is unique. Let $z(x)$ be a solution of (5.35) such that $z(x_0) = y_0$. Then
\[ \frac{dz}{dx} = f(x, z(x)) \qquad z(x_0) = y_0 \]
and integration yields
\[ z(x) = \int_{x_0}^{x} f(x, z(x))\,dx + y_0 \]
Hence
\[
\begin{aligned}
z(x) - y(x) &= \int_{x_0}^{x} [f(x, z(x)) - f(x, y(x))]\,dx \\
|z(x) - y(x)| &< M \left|\int_{x_0}^{x} |z(x) - y(x)|\,dx\right|
\end{aligned}
\tag{5.41}
\]
by applying the Lipschitz condition. Let $L$ be an upper bound of $|z(x) - y(x)|$. Then
\[ |z(x) - y(x)| < ML|x - x_0| \]
and, applying (5.41) successively, we obtain
\[
\begin{aligned}
|z(x) - y(x)| &< M^2 L \frac{|x - x_0|^2}{2!} \\
&\;\;\vdots \\
|z(x) - y(x)| &< M^n L \frac{|x - x_0|^n}{n!}
\end{aligned}
\]
Since $\dfrac{M^n |x - x_0|^n}{n!}$ is the $n$th term of $\displaystyle\sum_{n=0}^{\infty} \frac{M^n |x - x_0|^n}{n!} = e^{M|x - x_0|}$, which is known to converge, we have
\[ \lim_{n \to \infty} \frac{L M^n |x - x_0|^n}{n!} = 0 \]
for any fixed $x$. Hence the difference between $z(x)$ and $y(x)$ can be made as small as we please. This means that $z(x) \equiv y(x)$. Q.E.D.
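Picard's scheme (5.37) can be carried out exactly when the integrands are polynomials. The sketch below is an illustration, not part of the proof: it assumes the sample equation $dy/dx = x + y$ with $y(0) = 0$ (chosen here only as an example, with exact solution $e^x - x - 1$) and stores each approximation as a list of coefficients, lowest degree first. All function names are hypothetical.

```python
from fractions import Fraction

def integrate(poly):
    """Integrate a polynomial (coefficients, lowest degree first) from 0 to x."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]

def picard_step(y):
    """One step of (5.37) for f(x, y) = x + y with x0 = 0 and y0 = 0."""
    integrand = list(y) + [Fraction(0)] * max(0, 2 - len(y))
    integrand[1] += 1              # add the term x to y(x)
    return integrate(integrand)    # y0 = 0, so nothing further is added

def picard(n):
    """Return the coefficient list of the nth approximation y_n(x)."""
    y = [Fraction(0)]              # the constant function y0 = 0
    for _ in range(n):
        y = picard_step(y)
    return y

for n in range(1, 4):
    print(n, picard(n))
```

The successive approximations are $x^2/2$, then $x^2/2 + x^3/6$, then $x^2/2 + x^3/6 + x^4/24$: each step adds one more term of the series for $e^x - x - 1$, which is the uniform convergence the proof describes.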


Problems 


1. Show that $f(x, y) = xy$ satisfies the Lipschitz condition for $|x| < A < \infty$.
2. Show that (5.40) holds.
3. Show that $f(x, y) = x \sin y$ does not satisfy the Lipschitz condition for all $x$.

4. Consider $\dfrac{dy}{dx} = y$, $y(0) = 1$. Obtain the sequence (5.37), and show that
\[ \lim_{n \to \infty} y_n(x) = e^x \]


8. Consider au =z+y3,y(0) = 0 Obtain yi(r), g(r), y:(7), ye(z) from (5.37). 
6. Consider the system
\[
\begin{aligned}
\frac{dy}{dx} &= f(x, y, z) \\
\frac{dz}{dx} &= g(x, y, z)
\end{aligned}
\tag{5.42}
\]
Impose suitable restrictions on $f(x, y, z)$ and $g(x, y, z)$ in a neighborhood of the point $P_0(x_0, y_0, z_0)$, and obtain a sequence of functions $y_1(x), y_2(x), \ldots, y_n(x), \ldots$, and a sequence of functions $z_1(x), z_2(x), \ldots, z_n(x), \ldots$, such that $\lim_{n \to \infty} y_n(x) = y(x)$, $\lim_{n \to \infty} z_n(x) = z(x)$, where $y(x)$ and $z(x)$ satisfy (5.42). Show that $y(x)$ and $z(x)$ are unique if $y(x_0) = y_0$, $z(x_0) = z_0$.

5.6. Linear Dependence. The Wronskian. A system of functions $y_1(x), y_2(x), \ldots, y_n(x)$, $a \leq x \leq b$, is said to be linearly dependent over the interval $(a, b)$ if there exists a set of constants $c_1, c_2, \ldots, c_n$, not all zero, such that
\[ \sum_{i=1}^{n} c_i y_i(x) = 0 \tag{5.43} \]
for all $x$ on $(a, b)$. Otherwise we say that the $y_i(x)$ are linearly independent. Equation (5.43) implies that at least one of the functions can be written as a linear combination of the others. Thus, if $c_1 \neq 0$, then
\[ y_1 = -\frac{1}{c_1} \sum_{i=2}^{n} c_i y_i(x) \]
Linear dependence will be important in the study of linear differential equations. Let us find a criterion for linear dependence. Assume the $y_i(x)$ of (5.43) differentiable $n - 1$ times. Then
\[ \sum_{i=1}^{n} c_i y_i^{(j)}(x) = 0 \qquad j = 0, 1, 2, \ldots, n - 1 \tag{5.44} \]
by successive differentiations. The system (5.44) may be looked upon as a system of $n$ linear homogeneous equations in the unknowns $c_i$, $i = 1, 2, \ldots, n$. Since a nontrivial solution exists (remember not all $c_i$ vanish), we must have
\[ |y_i^{(j)}(x)| = 0 \]
or
\[
W(y_1, y_2, \ldots, y_n) =
\begin{vmatrix}
y_1(x) & y_2(x) & \cdots & y_n(x) \\
y_1'(x) & y_2'(x) & \cdots & y_n'(x) \\
\cdots & \cdots & \cdots & \cdots \\
y_1^{(n-1)}(x) & y_2^{(n-1)}(x) & \cdots & y_n^{(n-1)}(x)
\end{vmatrix} = 0 \tag{5.45}
\]
Determinant (5.45) is a necessary condition that the $y_i(x)$ be linearly dependent on $a \leq x \leq b$. This important determinant is called the Wronskian of $y_1, y_2, \ldots, y_n$. If $W(y_1, y_2, \ldots, y_n) \neq 0$, the $y_i$ are linearly independent.

Let us now investigate the converse. We take first an easy case. Let
\[
W(y_1, y_2) =
\begin{vmatrix}
y_1 & y_2 \\
y_1' & y_2'
\end{vmatrix} \equiv 0
\]
for $a \leq x \leq b$. Assume $y_1(x) \neq 0$ for $a \leq x \leq b$. The same result would be obtained if we assumed $y_2(x) \neq 0$ on $a \leq x \leq b$. Then
\[ y_1 y_2' - y_1' y_2 = 0 \qquad \frac{y_1 y_2' - y_2 y_1'}{y_1^2} = 0 \]
and $\dfrac{d}{dx}\left(\dfrac{y_2}{y_1}\right) = 0$ so that $y_2 = cy_1$, which shows that $y_1$ and $y_2$ are linearly dependent.

The proof for the general case is not so easy. We assume that (5.45) holds for $a \leq x \leq b$. Furthermore we assume continuity of the derivatives, along with the assumption that at least one of the minors of the last row of (5.45) does not vanish for $a \leq x \leq b$. For convenience we assume
\[
W(y_2, \ldots, y_n) =
\begin{vmatrix}
y_2(x) & y_3(x) & \cdots & y_n(x) \\
y_2'(x) & y_3'(x) & \cdots & y_n'(x) \\
\cdots & \cdots & \cdots & \cdots \\
y_2^{(n-2)}(x) & y_3^{(n-2)}(x) & \cdots & y_n^{(n-2)}(x)
\end{vmatrix} \neq 0
\]
for $a \leq x \leq b$. We now show that under these conditions the $y_i$ are linearly dependent over the range $a \leq x \leq b$. First expand (5.45) about the first column to obtain
\[ p_0(x)y_1^{(n-1)}(x) + p_1(x)y_1^{(n-2)}(x) + \cdots + p_{n-1}(x)y_1(x) = 0 \]
Hence $y_1(x)$ is a solution of the linear differential equation
\[ p_0(x)\frac{d^{n-1}y}{dx^{n-1}} + p_1(x)\frac{d^{n-2}y}{dx^{n-2}} + \cdots + p_{n-1}(x)y = 0 \tag{5.46} \]
Moreover, if we replace $y_1(x)$ by $y_2(x)$, $y_1'(x)$ by $y_2'(x)$, etc., in the first column of (5.45), the determinant vanishes since two columns are the same. Hence (5.46) is also satisfied by $y_2(x)$. With the same reasoning we find that $y_1(x), y_2(x), y_3(x), \ldots, y_n(x)$ are solutions of (5.46). The reader can verify easily that any linear combination of $y_1, y_2, \ldots, y_n$ is also a solution of (5.46). Now we fix our attention at a point $x_0$, $a \leq x_0 \leq b$. We have
\[
\begin{vmatrix}
y_1(x_0) & y_2(x_0) & \cdots & y_n(x_0) \\
y_1'(x_0) & y_2'(x_0) & \cdots & y_n'(x_0) \\
\cdots & \cdots & \cdots & \cdots \\
y_1^{(n-1)}(x_0) & y_2^{(n-1)}(x_0) & \cdots & y_n^{(n-1)}(x_0)
\end{vmatrix} = 0
\]
so that the system of homogeneous equations
\[ \sum_{i=1}^{n} c_i y_i^{(j)}(x_0) = 0 \qquad j = 0, 1, 2, \ldots, n - 1 \tag{5.47} \]
has a nontrivial solution in the $c_i$. Consider a set of $c_i$ not all zero which satisfies (5.47), and form
\[ y(x) = \sum_{i=1}^{n} c_i y_i(x) \tag{5.48} \]
Equation (5.48) is a solution of (5.46). Moreover $y(x_0) = 0$, $y'(x_0) = 0$, $\ldots$, $y^{(n-2)}(x_0) = 0$ as well as $y^{(n-1)}(x_0) = 0$. It will be shown in Sec. 5.7 that Eq. (5.46) has a unique solution when $n - 1$ initial conditions are imposed. Now $y(x) \equiv 0$ certainly satisfies (5.46) and the initial conditions $y(x_0) = y'(x_0) = y''(x_0) = \cdots = y^{(n-2)}(x_0) = 0$, so that $y(x)$ of (5.48) is identically zero for $a \leq x \leq b$. Hence
\[ \sum_{i=1}^{n} c_i y_i(x) = 0 \]
for $a \leq x \leq b$. Q.E.D.
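For two functions the Wronskian (5.45) is a $2 \times 2$ determinant and is easy to evaluate directly. The sketch below is an illustration with hypothetical names: it takes each function together with its derivative, supplied by hand, and evaluates $W = y_1 y_2' - y_1' y_2$ at a point.

```python
import math

def wronskian2(y1, dy1, y2, dy2, x):
    """W(y1, y2)(x) = y1(x) * y2'(x) - y1'(x) * y2(x)."""
    return y1(x) * dy2(x) - dy1(x) * y2(x)

def neg_sin(x):
    return -math.sin(x)

def triple_exp(x):
    return 3.0 * math.exp(x)

# sin x and cos x: the Wronskian equals -1 everywhere.
w_trig = [wronskian2(math.sin, math.cos, math.cos, neg_sin, x)
          for x in (0.0, 0.7, 2.5)]

# e^x and 3 e^x are proportional, so the Wronskian vanishes.
w_exp = [wronskian2(math.exp, math.exp, triple_exp, triple_exp, x)
         for x in (0.0, 0.7, 2.5)]

print(w_trig, w_exp)
```

The first list shows a Wronskian that never vanishes, so the pair is linearly independent; the second shows the vanishing Wronskian of a dependent pair.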


Problems

1. Show that $\sin x$ and $\cos x$ are linearly independent.

2. Show that $\sin x$, $\cos x$, $e^{ix}$ are linearly dependent. Assuming $e^{i\pi/2} = i$, show that $e^{ix} = \cos x + i \sin x$.

3. Consider $y_1 = x^2$, $y_2 = x|x|$, $-1 \leq x \leq 1$. Show that $W(y_1, y_2) = 0$ for $-1 \leq x \leq 1$. Does this imply that $y_1$ and $y_2$ are linearly dependent? Show that $y_1$ and $y_2$ are not linearly dependent on the range $-1 \leq x \leq 1$. Does this contradict the theorem derived above?

4. Let $y_1(x)$ and $y_2(x)$ be solutions of
\[ y'' + p(x)y' + q(x)y = 0 \]
for $a \leq x \leq b$. Show that $\dfrac{dW(y_1, y_2)}{dx} + p(x)W(y_1, y_2) = 0$ and hence that
\[ W(y_1, y_2) = A \exp\left[-\int_{x_0}^{x} p(x)\,dx\right] \]
Show that if $W = 0$ for $x = x_0$ then $W \equiv 0$ for all $x$ on $a \leq x \leq b$. How does one determine the constant $A$? Show that if $W \neq 0$ for $x = x_0$ then $W \neq 0$ for all $x$ on $a \leq x \leq b$.

5. Let $y_1(x)$, $y_2(x)$ be linearly independent solutions of
\[ y'' + p(x)y = 0 \tag{5.49} \]
for $a \leq x \leq b$. Let $y_3(x)$ be any solution of (5.49). Show that
\[ y_3(x) = c_1 y_1(x) + c_2 y_2(x) \]
Hint: Show that $W(y_1, y_2, y_3) = 0$ for $a \leq x \leq b$.

6. Let $y_1(x), y_2(x), \ldots, y_n(x)$ be linearly independent solutions (on the range $a \leq x \leq b$) of
\[ \frac{d^n y}{dx^n} + p_1(x)\frac{d^{n-1}y}{dx^{n-1}} + \cdots + p_n(x)y = 0 \tag{5.50} \]
If $y(x)$ is any solution of (5.50), show that
\[ y(x) = \sum_{k=1}^{n} c_k y_k(x) \]
for $a \leq x \leq b$.


5.7. Linear Differential Equations. The differential equation
\[ \frac{d^n y}{dx^n} + p_1(x)\frac{d^{n-1}y}{dx^{n-1}} + p_2(x)\frac{d^{n-2}y}{dx^{n-2}} + \cdots + p_n(x)y = p(x) \tag{5.51} \]
is called an $n$th-order linear differential equation. What can we say about a solution of (5.51) subject to the initial conditions $y(x_0) = y_0$, $y'(x_0) = y_0'$, $y''(x_0) = y_0''$, $\ldots$, $y^{(n-1)}(x_0) = y_0^{(n-1)}$? Assume $y_1(x)$ a solution of (5.51). We note that (5.51) can be written as a system of $n$ first-order equations by the simple device of introducing $n - 1$ new variables. Define $y_2, y_3, \ldots, y_n$ as follows:
\[
\begin{aligned}
\frac{dy_1}{dx} &= y_2 \\
\frac{dy_2}{dx} &= y_3 \\
&\;\;\vdots \\
\frac{dy_{n-1}}{dx} &= y_n \\
\frac{dy_n}{dx} &= -p_1(x)y_n - p_2(x)y_{n-1} - \cdots - p_n(x)y_1 + p(x)
\end{aligned}
\tag{5.52}
\]
The last equation of the system (5.52) is (5.51) in terms of $y_1, y_2, \ldots, y_n$. Conversely, if $[y_1(x), y_2(x), \ldots, y_n(x)]$ is a solution of (5.52), then $y_1(x)$ is easily seen to be a solution of (5.51). Now (5.52) is a special case of the very general system¹
\[ \frac{dy^i}{dx} = f^i(x, y^1, y^2, \ldots, y^n) \qquad i = 1, 2, \ldots, n \tag{5.53} \]
As in Sec. 5.5 it can be shown that, if the $f^i$ satisfy the suitable Lipschitz conditions, there exists a unique solution $y^1(x), y^2(x), \ldots, y^n(x)$ of (5.53) satisfying the initial conditions $y^i(x_0) = y_0^i$, $i = 1, 2, \ldots, n$. The Lipschitz condition for the $f^i$ is
\[ |f^i(x, y^1, y^2, \ldots, y^n) - f^i(x, z^1, z^2, \ldots, z^n)| < M \sum_{j=1}^{n} |y^j - z^j| \tag{5.54} \]
for all $x$ in a neighborhood of $x = x_0$, $M$ fixed.

¹ The exponents are superscripts, not powers.


If $p(x)$, $p_i(x)$, $i = 1, 2, \ldots, n$, of (5.52) are continuous in a neighborhood of $x = x_0$, then (5.54) is easily seen to hold. Hence (5.51) has a unique solution subject to the $n$ initial conditions stated above.

The reader can show easily that if $y_1(x), y_2(x), \ldots, y_n(x)$ are linearly independent solutions of the homogeneous equation
\[ \frac{d^n y}{dx^n} + p_1(x)\frac{d^{n-1}y}{dx^{n-1}} + \cdots + p_n(x)y = 0 \tag{5.55} \]
and if $\bar{y}(x)$ is any particular solution of (5.51), then
\[ y(x) = \sum_{i=1}^{n} c_i y_i(x) + \bar{y}(x) \tag{5.56} \]
is the most general solution of (5.51). Let the reader show that the $c_i$ are uniquely determined from the initial conditions on $y$ and its first $n - 1$ derivatives at $x = x_0$.

In general it is very difficult to find the solution to (5.51), and it is necessary at times to use infinite series in the attempt. This method will be discussed in a later paragraph. There is one case, however, which is the simplest by far. Naturally, one expects the difficulties in solving (5.51) to be alleviated to some extent if the $p_i(x)$ of (5.51) are constants. We study now this case. The homogeneous equation
\[ \frac{d^n y}{dx^n} + p_1\frac{d^{n-1}y}{dx^{n-1}} + p_2\frac{d^{n-2}y}{dx^{n-2}} + \cdots + p_n y = 0 \qquad p_i = \text{constant};\ i = 1, 2, \ldots, n \tag{5.57} \]
can be solved as follows: Assume a solution of the form $y = e^{mx}$. Substituting into (5.57) yields
\[ e^{mx}(m^n + p_1 m^{n-1} + p_2 m^{n-2} + \cdots + p_n) = 0 \]
Hence if $m$ is a root of the polynomial equation
\[ f(z) = z^n + p_1 z^{n-1} + p_2 z^{n-2} + \cdots + p_n = 0 \tag{5.58} \]
then $y = e^{mx}$ is a solution of (5.57). Equation (5.58) is easily obtained from (5.57). One replaces $\dfrac{d^k y}{dx^k}$ by $z^k$, $k = 0, 1, 2, \ldots, n$. If the $n$ roots of (5.58) are distinct, call them $m_1, m_2, \ldots, m_n$, then the general solution of (5.57) is
\[ y = \sum_{i=1}^{n} c_i e^{m_i x} \tag{5.59} \]
The reader can show that if the $m_i$ are distinct, then
\[ W(e^{m_1 x}, e^{m_2 x}, \ldots, e^{m_n x}) \neq 0 \]

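First let the reader show that
\[
W = \exp\left(x \sum_{k=1}^{n} m_k\right)
\begin{vmatrix}
1 & 1 & \cdots & 1 \\
m_1 & m_2 & \cdots & m_n \\
m_1^2 & m_2^2 & \cdots & m_n^2 \\
\cdots & \cdots & \cdots & \cdots \\
m_1^{n-1} & m_2^{n-1} & \cdots & m_n^{n-1}
\end{vmatrix}
\tag{5.60}
\]
Next let the reader notice that
\[
F(u) =
\begin{vmatrix}
1 & 1 & 1 & \cdots & 1 \\
u & m_2 & m_3 & \cdots & m_n \\
u^2 & m_2^2 & m_3^2 & \cdots & m_n^2 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
u^{n-1} & m_2^{n-1} & m_3^{n-1} & \cdots & m_n^{n-1}
\end{vmatrix}
\]
is a polynomial of degree $n - 1$ in $u$. The polynomial $F(u)$ certainly vanishes for the $n - 1$ distinct values $u = m_2, m_3, \ldots, m_n$. If $F(m_1)$ were zero, then
\[
\begin{vmatrix}
1 & 1 & \cdots & 1 \\
m_2 & m_3 & \cdots & m_n \\
m_2^2 & m_3^2 & \cdots & m_n^2 \\
\cdots & \cdots & \cdots & \cdots \\
m_2^{n-2} & m_3^{n-2} & \cdots & m_n^{n-2}
\end{vmatrix} = 0 \qquad \text{Why?}
\]
Continuing with this line of reasoning, the reader can show that as a consequence of assuming $W = 0$ one finally obtains
\[
\begin{vmatrix}
1 & 1 \\
m_{n-1} & m_n
\end{vmatrix} = 0
\]
a contradiction, since $m_n \neq m_{n-1}$.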

The case for which some of the roots of (5.58) are equal must be treated
separately. Before attacking this problem we shall find it beneficial to
introduce the operator

$$s \equiv \frac{d}{dx} \tag{5.61}$$

The notation $sf(x)$ means $sf(x) = \frac{df}{dx} = f'(x)$. Similarly

$$s^2 f(x) = s[sf(x)] = s[f'(x)] = f''(x)$$

and $s^n \equiv \frac{d^n}{dx^n}$. We define $(s + a)f(x) = sf(x) + af(x) = f'(x) + af(x)$
for any scalar $a$. Let the reader show that

$$(s + a)[(s + b)f(x)] = (s + b)[(s + a)f(x)] \tag{5.62}$$


if $a$ and $b$ are constants. Equation (5.57) can be written

$$(s^n + p_1 s^{n-1} + p_2 s^{n-2} + \cdots + p_{n-1}s + p_n)y = 0 \tag{5.63}$$

or

$$(s - m_1)(s - m_2)\cdots(s - m_n)y = 0 \tag{5.64}$$

using the result of (5.62). If $a$ is constant, we have

$$s(e^{ax}y) = e^{ax}sy + ae^{ax}y = e^{ax}(s + a)y$$

Let the reader show by mathematical induction that

$$s^n(e^{ax}y) = e^{ax}(s + a)^n y \tag{5.65}$$

Now let us assume that (5.58) has the distinct roots $m_1, m_2, \ldots, m_k$
of multiplicity $\alpha_1, \alpha_2, \ldots, \alpha_k$ so that (5.64) can be written

$$F(s)y \equiv (s - m_1)^{\alpha_1}(s - m_2)^{\alpha_2}\cdots(s - m_k)^{\alpha_k}y = 0 \tag{5.66}$$

We note that, if $y(x)$ satisfies $(s - m_1)^{\alpha_1}y(x) = 0$, then $y(x)$ also satisfies
(5.66). Now

$$e^{-m_1 x}(s - m_1)^{\alpha_1}y(x) = s^{\alpha_1}[e^{-m_1 x}y(x)]$$

from (5.65), so that any $y(x)$ which satisfies $(s - m_1)^{\alpha_1}y(x) = 0$ also
satisfies $s^{\alpha_1}(e^{-m_1 x}y) = 0$, and conversely. Hence

$$\frac{d^{\alpha_1}}{dx^{\alpha_1}}(e^{-m_1 x}y) = 0$$

so that

$$y(x) = e^{m_1 x}(C_1 + C_2 x + \cdots + C_{\alpha_1}x^{\alpha_1 - 1}) \tag{5.67}$$

It is easy to verify that (5.67) is a solution of (5.66). If $\alpha_1 = 1$, then
$y(x) = C_1 e^{m_1 x}$. Equation (5.67) contains $\alpha_1$ constants of integration.
Applying the same reasoning to the remaining roots yields $n$ constants of
integration. We omit the proof that the solutions thus obtained,
namely,

$$e^{m_1 x},\ xe^{m_1 x},\ \ldots,\ x^{\alpha_1 - 1}e^{m_1 x},\ e^{m_2 x},\ xe^{m_2 x},\ \ldots,\ x^{\alpha_k - 1}e^{m_k x}$$

are linearly independent.


Example 5.8. The differential equation $y'' - y = 0$ admits the solution $y = e^{mx}$
if $m^2 - 1 = 0$ or $m = \pm 1$. The most general solution is

$$y = C_1 e^{x} + C_2 e^{-x}$$

If $C_1 = \frac{1}{2}$, $C_2 = -\frac{1}{2}$, then $y = \sinh x$. If $C_1 = C_2 = \frac{1}{2}$, then $y = \cosh x$. It is easy
to verify that $\sinh x$ and $\cosh x$ are linearly independent. The most general solution
can also be written in the form

$$y = A_1\sinh x + A_2\cosh x$$


Example 5.9. We wish to solve $y'' + n^2y = 0$ subject to the initial conditions
$y(0) = 0$, $y'(0) = n$, $n$ real. $e^{mx}$ is a solution of $y'' + n^2y = 0$ if $m^2 + n^2 = 0$ or
$m = \pm ni$. Hence the solution is

$$y = Ae^{nix} + Be^{-nix}$$

From $(x = 0, y = 0)$ we have $0 = A + B$, and from $(x = 0, y' = n)$ we have

$$n = i(An - Bn)$$

or $A - B = -i$. Hence $A = \frac{1}{2i}$, $B = -\frac{1}{2i}$, and $y(x) = \frac{1}{2i}(e^{nix} - e^{-nix}) = \sin nx$.
Let the reader show that $y = A\sin nx + B\cos nx$ is also a general solution of
$y'' + n^2y = 0$.

Example 5.10. Consider

$$\frac{d^6y}{dx^6} + \frac{d^4y}{dx^4} - 6\frac{d^3y}{dx^3} + 4\frac{d^2y}{dx^2} = 0 \tag{5.68}$$

In operational form (5.68) becomes

$$(s^6 + s^4 - 6s^3 + 4s^2)y = 0$$

or $s^2(s - 1)^2(s^2 + 2s + 4)y = 0$. The zeros of $f(m) = m^2(m - 1)^2(m^2 + 2m + 4)$
are $m = 0, 0, 1, 1, -1 \pm \sqrt{3}\,i$. Hence the general solution of (5.68) is

$$y = e^{0x}(A + Bx) + e^{x}(C + Dx) + E\exp[(-1 + \sqrt{3}\,i)x] + F\exp[(-1 - \sqrt{3}\,i)x]$$

or

$$y = A + Bx + (C + Dx)e^{x} + e^{-x}(G\cos\sqrt{3}\,x + H\sin\sqrt{3}\,x)$$
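The root count in Example 5.10 can be checked mechanically. The following is a minimal sketch, not part of the original text: the helper names `horner`, `derivative`, and `multiplicity` are illustrative, and the tolerance `1e-9` is an arbitrary choice. It evaluates $f(m) = m^6 + m^4 - 6m^3 + 4m^2$ and its successive derivatives at each claimed root to read off the multiplicity.

```python
import cmath

# Coefficients of f(m) = m^6 + m^4 - 6 m^3 + 4 m^2, highest power first.
COEFFS = [1, 0, 1, -6, 4, 0, 0]

def horner(coeffs, m):
    """Evaluate a polynomial (highest power first) at m by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * m + c
    return value

def derivative(coeffs):
    """Coefficients of the derivative polynomial (highest power first)."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def multiplicity(coeffs, root, tol=1e-9):
    """Count how many successive derivatives (starting with f) vanish at root."""
    count = 0
    while len(coeffs) > 1 and abs(horner(coeffs, root)) < tol:
        count += 1
        coeffs = derivative(coeffs)
    return count

# Roots claimed in Example 5.10: 0, 1, and -1 +/- sqrt(3) i.
ROOTS = [0, 1, -1 + cmath.sqrt(3) * 1j, -1 - cmath.sqrt(3) * 1j]
MULTIPLICITIES = [multiplicity(COEFFS, r) for r in ROOTS]
print(MULTIPLICITIES)
```

The multiplicities add up to six, the degree of $f$, so the six constants $A, \ldots, H$ of the general solution are accounted for.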


It has been explained before that the most general solution of an 
inhomogeneous equation can be written provided: 

(i) A particular solution is known. 

(ii) The general solution of the homogeneous equation is known. 

The second part has just been considered. It remains to study meth- 
ods by which a particular solution can be found. Here any guess can 
be made, and, if successful, no further justification is needed. Quite 
often a particular solution can be determined by inspection. The work 
involves finding certain undetermined coefficients. For example: 

(a) If $p(x)$ of (5.57) is a polynomial, try a polynomial of the same degree.

(b) If $p(x) = ae^{rx}$, try $Ae^{rx}$.

(c) If $p(x) = a\sin\omega x + b\cos\omega x$, try $A\sin\omega x + B\cos\omega x$.


A few examples should make clear what we mean. 


(a) $y'' + y = 2x^2$. We assume a particular solution of the form $y = a + bx + cx^2$.
Then $y' = b + 2cx$, $y'' = 2c$ so that of necessity $2c + a + bx + cx^2 = 2x^2$ and
$c = 2$, $2c + a = 0$, $b = 0$. Hence $y = A\sin x + B\cos x + 2x^2 - 4$ is the general
solution of $y'' + y = 2x^2$.

(b) $y'' - y = x - 2e^{2x}$. Assume a particular solution of the form

$$y = a + bx + ce^{2x}$$

Then $y' = b + 2ce^{2x}$, $y'' = 4ce^{2x}$, and of necessity $4ce^{2x} - a - bx - ce^{2x} = x - 2e^{2x}$.
This yields $3c = -2$, $a = 0$, $b = -1$. The general solution is

$$y = Ae^{x} + Be^{-x} - x - \tfrac{2}{3}e^{2x}$$




(c) $y'' + 4y' - 2y = 2\sin x - 3e^{x} + 1 - 2x$. We try a particular solution in the
form $y = a\sin x + b\cos x + ce^{x} + d + fx$. Of necessity

$$-a\sin x - b\cos x + ce^{x} + 4a\cos x - 4b\sin x + 4ce^{x} + 4f - 2a\sin x - 2b\cos x - 2ce^{x} - 2d - 2fx = 2\sin x - 3e^{x} + 1 - 2x$$

Equating coefficients of $\sin x$, $\cos x$, $e^{x}$, etc., yields $-3a - 4b = 2$, $-3b + 4a = 0$,
$3c = -3$, $4f - 2d = 1$, $-2f = -2$, so that $a = -\frac{6}{25}$, $b = -\frac{8}{25}$, $c = -1$, $d = \frac{3}{2}$,
$f = 1$. The reader can verify that $y = -\frac{6}{25}\sin x - \frac{8}{25}\cos x - e^{x} + \frac{3}{2} + x$ is a
particular solution of (c).
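The verification left to the reader in (c) can be sketched numerically. The code below is an illustration only: the function names `trial_solution` and `residual` are assumptions of this sketch, and the sample points are arbitrary. It solves the coefficient equations exactly with rational arithmetic and then substitutes the trial solution back into $y'' + 4y' - 2y$.

```python
from fractions import Fraction
import math

# Coefficient equations from example (c):
#   -3a - 4b = 2,   4a - 3b = 0,   3c = -3,   4f - 2d = 1,   -2f = -2
det = Fraction(-3) * Fraction(-3) - Fraction(-4) * Fraction(4)   # 9 + 16 = 25
a = (Fraction(2) * Fraction(-3) - Fraction(-4) * Fraction(0)) / det  # Cramer's rule
b = (Fraction(-3) * Fraction(0) - Fraction(2) * Fraction(4)) / det
c = Fraction(-3, 3)
f = Fraction(-2, -2)
d = (4 * f - 1) / 2

def trial_solution(x):
    """Return (y, y', y'') for y = a sin x + b cos x + c e^x + d + f x."""
    fa, fb, fc, fd, ff = (float(v) for v in (a, b, c, d, f))
    y = fa * math.sin(x) + fb * math.cos(x) + fc * math.exp(x) + fd + ff * x
    dy = fa * math.cos(x) - fb * math.sin(x) + fc * math.exp(x) + ff
    d2y = -fa * math.sin(x) - fb * math.cos(x) + fc * math.exp(x)
    return y, dy, d2y

def residual(x):
    """Left side minus right side of y'' + 4y' - 2y = 2 sin x - 3 e^x + 1 - 2x."""
    y, dy, d2y = trial_solution(x)
    rhs = 2 * math.sin(x) - 3 * math.exp(x) + 1 - 2 * x
    return d2y + 4 * dy - 2 * y - rhs

MAX_RESIDUAL = max(abs(residual(x)) for x in (-1.0, 0.0, 0.7, 2.0))
print(a, b, c, d, f, MAX_RESIDUAL)
```

A residual at the level of rounding error at several sample points is consistent with the coefficients quoted in the text.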


Although in many cases a particular solution is easily obtained by 
the method of undetermined coefficients, it is important to have a formula 
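valid in all cases. The problem is to find a particular solution of the
equation

$$\frac{d^n y}{dx^n} + a_1\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_{n-1}\frac{dy}{dx} + a_n y = p(x) \tag{5.69}$$

Let $g(x)$ be a solution of the homogeneous equation

$$\frac{d^n y}{dx^n} + a_1\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_{n-1}\frac{dy}{dx} + a_n y = 0 \tag{5.70}$$

which satisfies the initial conditions

$$g(0) = g'(0) = g''(0) = \cdots = g^{(n-2)}(0) = 0 \qquad g^{(n-1)}(0) = 1$$

We prove that

$$y(x) = \int_0^x g(x - t)p(t)\,dt \tag{5.71}$$

is a particular solution of (5.69). Differentiating (5.71) twice yields

$$y'(x) = g(x - x)p(x) + \int_0^x \frac{\partial g(x - t)}{\partial x}p(t)\,dt = \int_0^x \frac{\partial g(x - t)}{\partial x}p(t)\,dt$$

$$y''(x) = \left.\frac{\partial g(x - t)}{\partial x}\right|_{t=x}p(x) + \int_0^x \frac{\partial^2 g(x - t)}{\partial x^2}p(t)\,dt = \int_0^x \frac{\partial^2 g(x - t)}{\partial x^2}p(t)\,dt$$

since $g(0) = 0$ and $g'(0) = 0$. Further differentiation yields

$$y^{(n-1)}(x) = \int_0^x \frac{\partial^{n-1} g(x - t)}{\partial x^{n-1}}p(t)\,dt$$

$$y^{(n)}(x) = \left.\frac{\partial^{n-1} g(x - t)}{\partial x^{n-1}}\right|_{t=x}p(x) + \int_0^x \frac{\partial^{n} g(x - t)}{\partial x^{n}}p(t)\,dt = p(x) + \int_0^x \frac{\partial^{n} g(x - t)}{\partial x^{n}}p(t)\,dt$$

since $g^{(n-1)}(0) = 1$. If these values are substituted into (5.69), one
obtains

$$y^{(n)}(x) + a_1 y^{(n-1)}(x) + \cdots + a_n y(x) = p(x) + \int_0^x \left[\frac{\partial^{n} g(x - t)}{\partial x^{n}} + a_1\frac{\partial^{n-1} g(x - t)}{\partial x^{n-1}} + \cdots + a_n g(x - t)\right]p(t)\,dt = p(x)$$

The expression

$$\frac{\partial^{n} g(x - t)}{\partial x^{n}} + a_1\frac{\partial^{n-1} g(x - t)}{\partial x^{n-1}} + \cdots + a_n g(x - t) \tag{5.72}$$

vanishes since $g(x)$ satisfies (5.70). The derivatives of $g(x)$ in (5.72) are
evaluated at $x - t$, but since $g(x)$ satisfies (5.70) for all $x$ we certainly
know that $g(x)$ satisfies (5.70) at $x - t$. We have shown that (5.71) is a
particular solution of (5.69); the $a_i$, $i = 1, 2, \ldots, n$, of (5.69) are
constants.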


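Example 5.11. Consider $y'' - y = \sin x$. We first solve $y'' - y = 0$, which
yields the solution $g(x) = Ae^{x} + Be^{-x}$. We now impose the initial conditions
$g(0) = 0$, $g'(0) = 1$. This yields $A + B = 0$, $A - B = 1$, so that $A = \frac{1}{2}$, $B = -\frac{1}{2}$.
Thus $g(x) = \frac{1}{2}(e^{x} - e^{-x}) = \sinh x$. A particular solution of $y'' - y = \sin x$ is

$$y_p = \int_0^x \tfrac{1}{2}\left(e^{x-t} - e^{-(x-t)}\right)\sin t\,dt = \frac{e^{x}}{2}\int_0^x e^{-t}\sin t\,dt - \frac{e^{-x}}{2}\int_0^x e^{t}\sin t\,dt = -\tfrac{1}{2}\sin x + \tfrac{1}{4}e^{x} - \tfrac{1}{4}e^{-x}$$

The complete solution is

$$y = Ae^{x} + Be^{-x} - \tfrac{1}{2}\sin x + \tfrac{1}{4}e^{x} - \tfrac{1}{4}e^{-x} = A_1e^{x} + B_1e^{-x} - \tfrac{1}{2}\sin x$$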
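Formula (5.71) lends itself to a direct numerical check. The sketch below is illustrative only: it approximates the integral by a composite Simpson rule (a quadrature chosen here for convenience, not a method used in the text), and the names `simpson`, `particular_solution`, and `closed_form` are assumptions of this sketch. It compares the integral with the closed form obtained in Example 5.11.

```python
import math

def simpson(func, lower, upper, panels=200):
    """Composite Simpson rule; `panels` must be even."""
    h = (upper - lower) / panels
    total = func(lower) + func(upper)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * func(lower + k * h)
    return total * h / 3

def particular_solution(x):
    """Formula (5.71) with g(x) = sinh x and p(t) = sin t."""
    return simpson(lambda t: math.sinh(x - t) * math.sin(t), 0.0, x)

def closed_form(x):
    """Closed form from Example 5.11: -(1/2) sin x + (1/4) e^x - (1/4) e^-x."""
    return -0.5 * math.sin(x) + 0.25 * math.exp(x) - 0.25 * math.exp(-x)

MAX_ERROR = max(abs(particular_solution(x) - closed_form(x))
                for x in (0.5, 1.0, 2.0, 3.0))
print(MAX_ERROR)
```

The same routine can be reused for any constant-coefficient equation once $g(x)$ is known, by replacing the integrand.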


Example 5.12. $y''' - 3y'' + 4y' - 2y = e^{x}\sec x$. The solution of

$$y''' - 3y'' + 4y' - 2y = 0$$

is

$$y = g(x) = Ae^{x} + e^{x}(B\sin x + C\cos x)$$

If $g(0) = 0$, $g'(0) = 0$, $g''(0) = 1$, we must have

$$A + C = 0 \qquad A + C + B = 0 \qquad A + 2B = 1$$

which yields $B = 0$, $A = 1$, $C = -1$. Thus

$$y_p(x) = \int_0^x \left[e^{x-t} - e^{x-t}\cos(x - t)\right]e^{t}\sec t\,dt = e^{x}\int_0^x \sec t\,dt - e^{x}\int_0^x (\cos x + \sin x\tan t)\,dt$$

$$= e^{x}\ln(\sec x + \tan x) - xe^{x}\cos x + e^{x}\sin x\ln\cos x$$

The general solution is

$$y = e^{x}\left[A + B\sin x + C\cos x + \ln(\sec x + \tan x) - x\cos x + \sin x\ln\cos x\right]$$


Example 5.13. Method of Variation of Parameters. Equation (5.71) is a valid
particular solution of (5.69) for constant coefficients. We look for a particular solution of

$$\frac{d^n y}{dx^n} + p_1(x)\frac{d^{n-1}y}{dx^{n-1}} + p_2(x)\frac{d^{n-2}y}{dx^{n-2}} + \cdots + p_n(x)y = p(x)$$

To simplify things, we shall discuss the general third-order linear differential equation

$$\frac{d^3y}{dx^3} + p_1(x)\frac{d^2y}{dx^2} + p_2(x)\frac{dy}{dx} + p_3(x)y = p(x) \tag{5.73}$$

Let $y_1(x)$, $y_2(x)$, $y_3(x)$ be linearly independent solutions of the homogeneous equation
derived from (5.73). We know that $y = \sum_{i=1}^{3} A_i y_i$ is the most general solution of the
homogeneous equation. The method of variation of parameters consists in attempting to find a
particular solution of the inhomogeneous equation by varying the $A_i$, $i = 1, 2, 3$, that is,
by assuming that the $A_i$, $i = 1, 2, 3$, are not constant. Assume

$$y = u_1y_1 + u_2y_2 + u_3y_3 \tag{5.74}$$

We shall impose three conditions on $du_i/dx$, $i = 1, 2, 3$. Two of these conditions will be

$$\sum_{i=1}^{3}\frac{du_i}{dx}y_i = 0 \qquad \sum_{i=1}^{3}\frac{du_i}{dx}\frac{dy_i}{dx} = 0 \tag{5.75}$$

Differentiating (5.74) and making use of (5.75) yields

$$\frac{dy}{dx} = \sum_{i=1}^{3}u_i\frac{dy_i}{dx} \tag{5.76}$$

Differentiating again and making use of (5.75) yields

$$\frac{d^2y}{dx^2} = \sum_{i=1}^{3}u_i\frac{d^2y_i}{dx^2} \tag{5.77}$$

Finally we have

$$\frac{d^3y}{dx^3} = \sum_{i=1}^{3}u_i\frac{d^3y_i}{dx^3} + \sum_{i=1}^{3}\frac{du_i}{dx}\frac{d^2y_i}{dx^2} \tag{5.78}$$

We multiply (5.77) by $p_1(x)$, (5.76) by $p_2(x)$, (5.74) by $p_3(x)$ and add these results to
(5.78). If $y(x)$ is to be a solution of (5.73), of necessity

$$\sum_{i=1}^{3}\frac{du_i}{dx}\frac{d^2y_i}{dx^2} = p(x) \tag{5.79}$$





Equations (5.75) and (5.79) are three linear equations in the unknowns $du_i/dx$, $i = 1, 2, 3$.
Solving for $du_1/dx$ yields

$$\frac{du_1}{dx} = \frac{\begin{vmatrix} 0 & y_2 & y_3 \\ 0 & y_2' & y_3' \\ p(x) & y_2'' & y_3'' \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1'' & y_2'' & y_3'' \end{vmatrix}} = \frac{p(x)W(y_2, y_3)}{W(y_1, y_2, y_3)} \tag{5.80}$$

The Wronskian $W(y_1, y_2, y_3)$ does not vanish since $y_1$, $y_2$, $y_3$ are linearly independent.
Hence

$$u_1(x) = \int^{x}\frac{p(x)W(y_2, y_3)}{W(y_1, y_2, y_3)}\,dx$$

and in general

$$u_i(x) = \int^{x}\frac{p(x)W(y_{i+1}, y_{i+2})}{W(y_1, y_2, y_3)}\,dx \tag{5.81}$$

where $y_4(x) \equiv y_1(x)$, $y_5(x) \equiv y_2(x)$. The reader can verify that

$$y(x) = \sum_{i=1}^{3}y_i(x)\int^{x}\frac{p(t)W(y_{i+1}(t), y_{i+2}(t))}{W(y_1(t), y_2(t), y_3(t))}\,dt \tag{5.82}$$

is a particular solution of (5.73).

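Example 5.14. We consider $y'' + (1/x)y' - (1/x^2)y = x$. It is easy to verify that
$y_1(x) = x$, $y_2(x) = 1/x$ are linearly independent solutions of $y'' + (1/x)y' - (1/x^2)y = 0$.
We have

$$u_1(x) = \int\frac{\begin{vmatrix} 0 & y_2(x) \\ p(x) & y_2'(x) \end{vmatrix}}{\begin{vmatrix} y_1(x) & y_2(x) \\ y_1'(x) & y_2'(x) \end{vmatrix}}\,dx = \int\frac{-x(1/x)}{-2/x}\,dx = \frac{x^2}{4}$$

$$u_2(x) = \int\frac{\begin{vmatrix} y_1(x) & 0 \\ y_1'(x) & p(x) \end{vmatrix}}{\begin{vmatrix} y_1(x) & y_2(x) \\ y_1'(x) & y_2'(x) \end{vmatrix}}\,dx = \int\frac{x^2}{-2/x}\,dx = -\frac{x^4}{8}$$

A particular solution is $y = (x^2/4)(x) + (-x^4/8)(1/x) = x^3/8$. The complete solution
is $y = Ax + \dfrac{B}{x} + \dfrac{x^3}{8}$.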
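The second-order analogue of (5.81) can be tried numerically on Example 5.14. The following is a sketch under stated assumptions: the lower limit of integration $x_0 = 1$ is chosen here (the text leaves it unspecified, and $x = 0$ must be avoided because $y_2 = 1/x$), Simpson's rule is a convenience of this sketch, and all function names are illustrative. Changing the lower limit only adds a solution of the homogeneous equation to the particular solution.

```python
def simpson(func, lower, upper, panels=200):
    """Composite Simpson rule; `panels` must be even."""
    h = (upper - lower) / panels
    total = func(lower) + func(upper)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * func(lower + k * h)
    return total * h / 3

# Data of Example 5.14: y1 = x, y2 = 1/x, right-hand side p(x) = x.
def y1(x): return x
def y2(x): return 1.0 / x
def dy1(x): return 1.0
def dy2(x): return -1.0 / x ** 2
def p(x): return x

def wronskian(x):
    """W(y1, y2) = y1 y2' - y2 y1'."""
    return y1(x) * dy2(x) - y2(x) * dy1(x)

def particular_solution(x, x0=1.0):
    """u1 y1 + u2 y2 with u1' = -p y2 / W and u2' = p y1 / W, integrated from x0."""
    u1 = simpson(lambda t: -p(t) * y2(t) / wronskian(t), x0, x)
    u2 = simpson(lambda t: p(t) * y1(t) / wronskian(t), x0, x)
    return u1 * y1(x) + u2 * y2(x)

# With x0 = 1 the result is x^3/8 - x/4 + 1/(8x): the x^3/8 of the text
# plus a combination of the homogeneous solutions x and 1/x.
DIFFERENCE = particular_solution(2.0) - 2.0 ** 3 / 8
print(DIFFERENCE)
```

The difference from $x^3/8$ equals $-x/4 + 1/(8x)$ at $x = 2$, which is absorbed into the arbitrary constants $A$ and $B$ of the complete solution.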

Problems 


1. $Ly'' + Ry' + (1/C)y = 0$, $L$, $R$, $C$ constants. Discuss the cases $R^2 - 4L/C > 0$, $= 0$, $< 0$.


2y’ —y=2
8.0!" — 2y” + y = 5 sin 2x
4.y"—y x=zt+il .




6. y'" +4’ =2sinz 

6. y'’ — 4y’ + Sy = e** + 4 sin z 

7. If $y_1$ is a particular solution of $y'' + p(x)y' + q(x)y = r_1(x)$ and $y_2$ is a particular
solution of $y'' + p(x)y' + q(x)y = r_2(x)$, show that $y_1 + y_2$ is a particular solution of
$y'' + p(x)y' + q(x)y = r_1(x) + r_2(x)$.

8. yy’ +y = tang 

9. y!’ — 2y’ + y = cos* x 

10. y’’ — 4y’ + 4y = 2ze* 

11. Solve y/"” — y” =1+2 +8sin z by the variation-of-parameter method. 


5.8. Properties of Second-order Linear Differential Equations. We
consider

$$\frac{d^2y}{dx^2} + p(x)\frac{dy}{dx} + q(x)y = 0 \tag{5.83}$$

(a) If $p(x)$ and $q(x)$ are continuous on $a \le x \le b$, there exists a unique
solution $y(x)$ of (5.83) subject to the initial conditions $y(x_0) = y_0$,
$y'(x_0) = y_0'$, $a \le x_0 \le b$. Moreover $dy/dx$ is continuous since $d^2y/dx^2$ exists.
We now show that if $y(x)$ is a solution of (5.83) then $y(x)$ cannot have an
infinite number of zeros on the interval $a \le x \le b$ unless $y(x) \equiv 0$. The
proof is as follows: Assume $y(x)$ vanishes infinitely often on the interval
$a \le x \le b$. From the Weierstrass-Bolzano theorem there exists a limit
point $c$, $a \le c \le b$. We can pick out a subsequence $x_1, x_2, \ldots, x_n, \ldots$
which converges to $c$ such that $y(x_n) = 0$, $n = 1, 2, \ldots$. From
the theorem of the mean $y(x_n) - y(x_{n-1}) = (x_n - x_{n-1})y'(\xi_n)$ so that
$y'(\xi_n) = 0$, $x_{n-1} \le \xi_n \le x_n$, $n = 1, 2, \ldots$. Hence $y'(c) = \lim_{n\to\infty} y'(\xi_n) = 0$
since the $\xi_n$ also approach $x = c$ as a limit and since $y'(x)$ is continuous
at $x = c$. Moreover $y(c) = \lim_{n\to\infty} y(x_n) = 0$. But $y(x) \equiv 0$ satisfies
(5.83), and $y(c) = 0$, $y'(c) = 0$. Since $y(x)$ is unique, we must have
$y(x) \equiv 0$. Q.E.D.

(b) Let $y_1(x)$ and $y_2(x)$ be linearly independent solutions of (5.83).
Let $y_1(c_1) = y_1(c_2) = 0$ and $y_1(x) \neq 0$ for $a \le c_1 < x < c_2 \le b$. We
show that $y_2(c) = 0$, $c_1 < c < c_2$, that is, the zeros of $y_1(x)$ and $y_2(x)$
separate each other. Assume $y_2(x) \neq 0$ for $c_1 < x < c_2$. Then

$$\varphi(x) = \frac{y_1(x)}{y_2(x)}$$

has no singularities for $c_1 \le x \le c_2$. Why is it true that $y_2(c_1) \neq 0$,
$y_2(c_2) \neq 0$? Moreover $\varphi(c_1) = \varphi(c_2) = 0$. From the theorem of the
mean $\varphi'(\xi) = 0$ for $c_1 \le \xi \le c_2$. But

$$\varphi'(\xi) = \frac{y_2(\xi)y_1'(\xi) - y_1(\xi)y_2'(\xi)}{y_2^2(\xi)} = -\frac{W(y_1, y_2)}{y_2^2(\xi)}$$

Since $y_1(x)$ and $y_2(x)$ are linearly independent, we know that $W(y_1, y_2) \neq 0$
for $a \le x \le b$. Hence $\varphi'(\xi)$ cannot be zero, a contradiction. $y_2(x)$ must
be zero for some $x$, $c_1 < x < c_2$. Q.E.D.

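(c) We can write (5.83) in the form

$$\frac{d}{dx}\left[K(x)\frac{dy}{dx}\right] + Q(x)y = 0 \tag{5.84}$$

We multiply (5.83) by $\exp\left(\int^{x}p(t)\,dt\right)$ and write

$$\frac{d}{dx}\left[\exp\left(\int^{x}p(t)\,dt\right)\frac{dy}{dx}\right] + q(x)\exp\left(\int^{x}p(t)\,dt\right)y = 0$$

so that $K(x) = \exp\int^{x}p(t)\,dt$, $Q(x) = q(x)\exp\int^{x}p(t)\,dt$.
An important differential equation is the Sturm-Liouville equation

$$\frac{d}{dx}\left[K(x)\frac{dy}{dx}\right] + \lambda g(x)y = 0 \tag{5.85}$$

where $\lambda$ is a parameter. Assume $K(a) = K(b) = 0$, and let $y_1(x, \lambda_1)$ be a
solution of (5.85) for $\lambda = \lambda_1$, $y_2(x, \lambda_2)$ a solution of (5.85) for $\lambda = \lambda_2$.
We show that

$$\int_a^b g(x)y_1(x, \lambda_1)y_2(x, \lambda_2)\,dx = 0 \qquad \lambda_1 \neq \lambda_2 \tag{5.86}$$

We have

$$\frac{d}{dx}\left[K(x)\frac{dy_1}{dx}\right] + \lambda_1 g(x)y_1 = 0 \qquad \frac{d}{dx}\left[K(x)\frac{dy_2}{dx}\right] + \lambda_2 g(x)y_2 = 0 \tag{5.87}$$

Multiply the first equation of (5.87) by $y_2$ and the second by $y_1$, and
subtract. This yields

$$\frac{d}{dx}\left[K(x)\left(y_2\frac{dy_1}{dx} - y_1\frac{dy_2}{dx}\right)\right] + (\lambda_1 - \lambda_2)g(x)y_1y_2 = 0 \tag{5.88}$$

Equation (5.86) follows if we integrate (5.88) over the range $a \le x \le b$.
The orthogonality property (5.86) will be discussed in greater detail in
Chap. 6 dealing with orthogonal polynomials.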

(d) Let $y_1(x)$ be a solution of $y'' + G_1(x)y = 0$, and let $y_2(x)$ be a solution
of $y'' + G_2(x)y = 0$. Assume further that $y_1(a) = y_2(a) = \alpha > 0$,
$y_1'(a) = y_2'(a) = \beta$, $G_1 < G_2$. We show that $y_2(x)$ vanishes before $y_1(x)$
vanishes, $x > a$. The proof is as follows: We have $y_1'' + G_1(x)y_1 = 0$,
$y_2'' + G_2(x)y_2 = 0$ so that

$$y_2y_1'' - y_1y_2'' + (G_1 - G_2)y_1y_2 = 0$$

$$\frac{d}{dx}(y_2y_1' - y_1y_2') + (G_1 - G_2)y_1y_2 = 0$$

$$\int_a^c\frac{d}{dx}(y_2y_1' - y_1y_2')\,dx = \int_a^c(G_2 - G_1)y_1(x)y_2(x)\,dx$$

$$\Big[y_2y_1' - y_1y_2'\Big]_a^c = \int_a^c(G_2 - G_1)y_1y_2\,dx$$

$$y_2(c)y_1'(c) - y_1(c)y_2'(c) - \alpha\beta + \alpha\beta = \int_a^c(G_2 - G_1)y_1y_2\,dx \tag{5.89}$$

Let $c$ be the first zero of $y_1(x)$, $c > a$, and assume $y_2(x) \neq 0$ for $a \le x \le c$.
From (5.89) and $y_1(c) = 0$ we have

$$y_2(c)y_1'(c) = \int_a^c(G_2 - G_1)y_1y_2\,dx \tag{5.90}$$

Now $G_2(x) - G_1(x) > 0$ for $a \le x \le c$, $y_1(x) > 0$ for $a \le x < c$, $y_2(x) > 0$
for $a \le x \le c$. Hence $y_2(c)y_1'(c) > 0$ from (5.90), so that $y_1'(c) > 0$.
But

$$y_1'(c) = \lim_{\substack{x\to c \\ x<c}}\frac{y_1(x) - y_1(c)}{x - c} = \lim_{\substack{x\to c \\ x<c}}\frac{y_1(x)}{x - c} \le 0$$

since $y_1(x) > 0$ and $x < c$. This is a contradiction. Q.E.D. The same
result would have been obtained if $y_1(a) = y_2(a) = \alpha < 0$.
Let the reader show that, if $y_1(c_1) = 0$, $y_1(c_2) = 0$, $y_1(x) \neq 0$ for
$c_1 < x < c_2$, then $y_2(x) = 0$ for some $x$ on $c_1 \le x \le c_2$.
Problems

1. Consider $y'' + p(x)y' + q(x)y = 0$. Let $y = uv$, and determine $u(x)$ so that the
resulting second-order differential equation in $v(x)$ does not contain $dv/dx$.

2. Consider Legendre's equation

$$(1 - x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + n(n + 1)y = 0 \tag{5.91}$$

Let $y_1(x, \alpha)$, $y_2(x, \beta)$ be solutions of (5.91) for $n = \alpha$ and $n = \beta$, $\alpha \neq \beta$. Show that

$$\int_{-1}^{1}y_1(x, \alpha)y_2(x, \beta)\,dx = 0$$

3. Prove the second statement of (d).

4. Give an example of (b).


5.9. Differential Equations in the Complex Domain. Up to the present 
we have discussed differential equations from a real-variable view. 
Greater insight into the solutions of differential equations can be obtained 




if we study the differential equations from a complex-variable point of
view. The reader may recall that the Taylor-series expansion of

$$f(x) = \frac{1}{1 + x^2}$$

about $x = 0$ converges only for $|x| < 1$. In a way this is puzzling since
$1/(1 + x^2)$ has no singularities on the real axis. However, the analytic
continuation of $f(x)$, namely, $f(z) = 1/(1 + z^2)$, is known to have poles
at $z = \pm i$. The distance of $z = i$ and $z = -i$ from $z = 0$ is 1, so that
the Taylor-series expansion of $1/(1 + z^2)$ about $z = 0$ has a radius of
convergence equal to unity.
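The effect of the poles at $z = \pm i$ on the real axis is easy to observe numerically. The short sketch below is an illustration only: the function name `partial_sum` and the sample points $0.5$ and $1.5$ are choices made here. It sums the series $1 - x^2 + x^4 - \cdots$ at one point inside and one point outside the unit interval.

```python
def partial_sum(x, terms):
    """Sum of the first `terms` terms of the Taylor series 1 - x^2 + x^4 - ..."""
    return sum((-1) ** k * x ** (2 * k) for k in range(terms))

def exact(x):
    """The function being expanded, f(x) = 1 / (1 + x^2)."""
    return 1.0 / (1.0 + x * x)

# Error after 40 terms at a point with |x| < 1 and at a point with |x| > 1.
INSIDE = abs(partial_sum(0.5, 40) - exact(0.5))
OUTSIDE = abs(partial_sum(1.5, 40) - exact(1.5))
print(INSIDE, OUTSIDE)
```

Although $f(x)$ is perfectly smooth at $x = 1.5$, the partial sums there grow without bound, in line with the radius of convergence being fixed by the complex singularities.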

The simplest first-order linear differential equation may be considered
without difficulty. If $p(z)$ is analytic at $z = z_0$, the solution of

$$\frac{dw}{dz} + p(z)w = 0 \tag{5.92}$$

is $w(z) = w_0\exp\left[-\int_{z_0}^{z}p(t)\,dt\right]$, and $w(z)$ is analytic at $z = z_0$.
The first nontrivial case is the linear second-order differential equation

$$\frac{d^2w}{dz^2} + p(z)\frac{dw}{dz} + q(z)w = 0 \tag{5.93}$$

This equation is most important to the physicist and engineer. Moreover,
the methods used for solving it apply equally well to higher-order
equations. In attempting to solve (5.93) we must obviously consider
the coefficients $p(z)$ and $q(z)$. Let $z = z_0$ be a point such that $p(z)$ and
$q(z)$ are analytic at $z = z_0$. Remember this means that $p(z)$ and $q(z)$
are differentiable in some neighborhood of $z = z_0$. A point $z_0$ of this
type is said to be an ordinary point of (5.93). Let $R$ be the smaller of the
two radii of convergence of the series expansions of $p(z)$ and $q(z)$ about
$z = z_0$. We shall now prove Theorem 5.2.

THEOREM 5.2. For any two complex numbers $w_0$, $w_0'$ there exists a
unique function $w(z)$ satisfying (5.93) such that $w(z_0) = w_0$, $w'(z_0) = w_0'$.
Moreover $w(z)$ is analytic for $|z - z_0| < R$.

We begin the proof by removing the first-order term of (5.93).
Let $w = uv$, $u$ and $v$ undefined as yet. Then

$$\frac{dw}{dz} = u\frac{dv}{dz} + v\frac{du}{dz} \qquad \frac{d^2w}{dz^2} = u\frac{d^2v}{dz^2} + 2\frac{du}{dz}\frac{dv}{dz} + v\frac{d^2u}{dz^2}$$

Substituting into (5.93) yields

$$u\frac{d^2v}{dz^2} + \left(pu + 2\frac{du}{dz}\right)\frac{dv}{dz} + \left(\frac{d^2u}{dz^2} + p\frac{du}{dz} + qu\right)v = 0$$

If we set $pu + 2\dfrac{du}{dz} = 0$, we see that

$$u(z) = \exp\left[-\frac{1}{2}\int_{z_0}^{z}p(t)\,dt\right]$$

Thus the substitution $w = v\exp\left[-\frac{1}{2}\int_{z_0}^{z}p(t)\,dt\right]$ reduces (5.93) to

$$\frac{d^2v}{dz^2} + J(z)v = 0 \tag{5.94}$$

where, from $u' = -\frac{1}{2}pu$, one finds $J(z) = q(z) - \frac{1}{2}p'(z) - \frac{1}{4}p^2(z)$.
Let the reader show that $J(z)$ is analytic for $|z - z_0| < R$. Moreover
the reader can also show that if $v(z)$ is a solution of (5.94) then

$$w = v\exp\left[-\frac{1}{2}\int_{z_0}^{z}p(t)\,dt\right]$$

is a solution of (5.93). Next we attempt to reduce (5.94) to an integral
equation. Assume $v(z)$ satisfies (5.94) such that $v(z_0) = v_0$, $v'(z_0) = v_0'$.
Then

$$\int_{z_0}^{z}\frac{d^2v}{dz^2}\,dz = -\int_{z_0}^{z}J(z)v(z)\,dz$$

or

$$\frac{dv}{dz} - v_0' = -\int_{z_0}^{z}J(\tau)v(\tau)\,d\tau$$

Integrating again yields

$$v(z) - v_0 - v_0'(z - z_0) = -\int_{z_0}^{z}\int_{z_0}^{\zeta}J(\tau)v(\tau)\,d\tau\,d\zeta = -\int_{z_0}^{z}\varphi(\zeta)\,d\zeta$$

where $\varphi(\zeta) = \int_{z_0}^{\zeta}J(\tau)v(\tau)\,d\tau$. Note that $\varphi(z_0) = 0$. We now integrate
by parts and obtain

$$v(z) - v_0 - v_0'(z - z_0) = -z\varphi(z) + \int_{z_0}^{z}\zeta J(\zeta)v(\zeta)\,d\zeta$$

and

$$v(z) = v_0 + v_0'(z - z_0) + \int_{z_0}^{z}(\zeta - z)J(\zeta)v(\zeta)\,d\zeta \tag{5.95}$$

The reader can show that if $v(z)$ satisfies (5.95) then $v(z)$ also satisfies
(5.94). Equation (5.95) is a Volterra integral equation of the second kind,
since the unknown $v$ appears both outside and under the integral sign.
It is a special case of the integral equation

$$v(z) = f(z) + \int_{z_0}^{z}k(z, s)v(s)\,ds \tag{5.96}$$

To find a solution of (5.95), we apply the method of successive approximations
due to Picard (see Sec. 5.5). We define the sequence $v_0(z)$,
$v_1(z)$, $v_2(z)$, $\ldots$, $v_n(z)$, $\ldots$ as follows:

$$\begin{aligned} v_0(z) &= v_0 + v_0'(z - z_0) \\ v_1(z) &= v_0 + v_0'(z - z_0) + \int_{z_0}^{z}(\zeta - z)J(\zeta)v_0(\zeta)\,d\zeta \\ v_2(z) &= v_0 + v_0'(z - z_0) + \int_{z_0}^{z}(\zeta - z)J(\zeta)v_1(\zeta)\,d\zeta \\ &\;\;\vdots \\ v_n(z) &= v_0 + v_0'(z - z_0) + \int_{z_0}^{z}(\zeta - z)J(\zeta)v_{n-1}(\zeta)\,d\zeta \end{aligned} \tag{5.97}$$

Now

$$v_{n+1}(z) = v_0(z) + [v_1(z) - v_0(z)] + [v_2(z) - v_1(z)] + \cdots + [v_{n+1}(z) - v_n(z)]$$

so that the convergence of the sequence $\{v_n(z)\}$ hinges on the convergence
of the series $v_0(z) + \sum_{k=0}^{\infty}[v_{k+1}(z) - v_k(z)]$. We can write

$$v_{k+1}(z) - v_k(z) = \int_{z_0}^{z}(\zeta - z)J(\zeta)[v_k(\zeta) - v_{k-1}(\zeta)]\,d\zeta \tag{5.98}$$


Fig. 5.1

We choose as our path of integration the straight line joining $z_0$ to $z$ so
that $\zeta = z_0 + t(z - z_0)$, $0 \le t \le 1$, with $\zeta - z_0 = t(z - z_0)$ and

$$d\zeta = (z - z_0)\,dt$$

Let $C$ be the circle of analyticity of $J(z)$ with center at $z_0$, and let $z$
be an interior point. We construct a circle $S$ with center at $z_0$ interior
to $C$ and containing the given point $z$ in its interior (see Fig. 5.1).
Since $J(\zeta)$ and $\zeta - z$ are analytic inside and on $S$, $J(\zeta)(\zeta - z)$ is
bounded by some constant $M$, that is, $|J(\zeta)(\zeta - z)| < M$ for all $z$, $\zeta$ in and on $S$. Hence

$$|v_1(z) - v_0(z)| = \left|\int_{z_0}^{z}(\zeta - z)J(\zeta)v_0(\zeta)\,d\zeta\right| < \mu M\int_{z_0}^{z}|d\zeta| = \mu M|z - z_0|$$

where $\mu$ is any upper bound of $|v_0(z)|$ in and on $S$. Similarly


$$|v_2(z) - v_1(z)| = \left|\int_{z_0}^{z}(\zeta - z)J(\zeta)[v_1(\zeta) - v_0(\zeta)]\,d\zeta\right| < \mu M^2\int_{z_0}^{z}|\zeta - z_0|\,|d\zeta| = \mu M^2|z - z_0|^2\int_0^1 t\,dt = \mu M^2\frac{|z - z_0|^2}{2!}$$

By mathematical induction the reader can show that

$$|v_{n+1}(z) - v_n(z)| < \mu M^{n+1}\frac{|z - z_0|^{n+1}}{(n + 1)!} \tag{5.99}$$

Thus the series representing $v_{n+1}(z)$ is bounded by the series

$$\mu\sum_{k=0}^{n+1}\frac{M^k|z - z_0|^k}{k!}$$

which in turn is bounded by the series of constant terms $\mu\sum_{k=0}^{\infty}\dfrac{M^kR^k}{k!}$ since
$R > |z - z_0|$. We know that the latter series converges to $\mu e^{MR}$ so
that from the Weierstrass M test the series representing $v_{n+1}(z)$ converges
uniformly. Since each term of the series representing $v_{n+1}(z)$ is analytic,
$v_{n+1}(z)$ converges to an analytic function $v(z)$ (see Prob. 7, Sec. 4.9). We
now show that the limiting function, $v(z)$, satisfies (5.95). From (5.97)

$$\lim_{n\to\infty}v_{n+1}(z) = v_0 + v_0'(z - z_0) + \lim_{n\to\infty}\int_{z_0}^{z}(\zeta - z)J(\zeta)v_n(\zeta)\,d\zeta$$

$$v(z) = v_0 + v_0'(z - z_0) + \int_{z_0}^{z}\lim_{n\to\infty}(\zeta - z)J(\zeta)v_n(\zeta)\,d\zeta = v_0 + v_0'(z - z_0) + \int_{z_0}^{z}(\zeta - z)J(\zeta)v(\zeta)\,d\zeta$$

It is possible to take the limit process inside the integral because of the
uniform convergence of $v_n(z)$ to $v(z)$.

Next we prove that $v(z)$ is unique. Let $u(z)$ also satisfy (5.95). It is
easy to verify that $r(z) = v(z) - u(z)$ satisfies

$$r(z) = \int_{z_0}^{z}(\zeta - z)J(\zeta)r(\zeta)\,d\zeta \tag{5.100}$$

Since $u(z)$ and $v(z)$ are analytic in and on $S$, the function $r(z)$ is bounded
by some constant $K$, $|r(\zeta)| < K$. From (5.100) we have

$$|r(z)| < MK\int_{z_0}^{z}|d\zeta| = MK|z - z_0|$$

Applying this inequality to (5.100) yields

$$|r(z)| < M\int_{z_0}^{z}|r(\zeta)|\,|d\zeta| < M^2K\int_{z_0}^{z}|\zeta - z_0|\,|d\zeta| = M^2K\frac{|z - z_0|^2}{2!}$$

Continuing this process yields

$$|r(z)| < KM^n\frac{|z - z_0|^n}{n!}$$

for all integers $n$. Since $\lim_{n\to\infty}\dfrac{KM^n|z - z_0|^n}{n!} = 0$, we can make $|r(z)|$
as small as we please. This is possible only if $r(z) \equiv 0$, so that $u(z) \equiv v(z)$.
Q.E.D.
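The successive approximations (5.97) can be carried out exactly in a simple case. The sketch below is an illustration under stated assumptions, not the text's own example: it takes $J(z) \equiv 1$, $z_0 = 0$, $v_0 = 1$, $v_0' = 0$ (so that (5.94) is $v'' + v = 0$ with solution $\cos z$), represents each approximation as a list of polynomial coefficients, and uses the term-by-term formula $\int_0^z(\zeta - z)\zeta^k\,d\zeta = -z^{k+2}/[(k+1)(k+2)]$. The names `picard_step` and `picard` are hypothetical.

```python
from fractions import Fraction

def picard_step(poly):
    """Map v to 1 + integral_0^z (zeta - z) v(zeta) d zeta, for J = 1.

    poly[k] is the coefficient of z**k.  Term by term,
    integral_0^z (zeta - z) zeta**k d zeta = -z**(k+2) / ((k+1)(k+2)).
    """
    result = [Fraction(0)] * (len(poly) + 2)
    result[0] = Fraction(1)   # v_0 + v_0'(z - z_0) with v_0 = 1, v_0' = 0
    for k, coeff in enumerate(poly):
        result[k + 2] += -coeff / ((k + 1) * (k + 2))
    return result

def picard(iterations):
    """Return the coefficient list of the approximation v_n(z), n = iterations."""
    poly = [Fraction(1)]      # zeroth approximation v_0(z) = 1
    for _ in range(iterations):
        poly = picard_step(poly)
    return poly

V3 = picard(3)
print(V3)
```

Each step adds the next term of the Taylor series of $\cos z$, so in this case the sequence $\{v_n(z)\}$ is simply the sequence of partial sums, and the bound (5.99) reduces to the familiar estimate for the terms $z^{2n}/(2n)!$.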
Example 5.15. We consider $w'' - zw' - w = 0$. Certainly $z = 0$ is an ordinary
point. An analytic solution is known to exist. The easiest way to find this solution
is to let $w = \sum_{n=0}^{\infty}c_nz^n$. The $c_n$ are to be determined by the condition that $w$ satisfy
$w'' - zw' - w = 0$. We have

$$w' = \sum_{n=1}^{\infty}nc_nz^{n-1} \qquad w'' = \sum_{n=2}^{\infty}n(n - 1)c_nz^{n-2}$$

so that

$$\sum_{n=2}^{\infty}n(n - 1)c_nz^{n-2} - \sum_{n=1}^{\infty}nc_nz^{n} - \sum_{n=0}^{\infty}c_nz^{n} = 0$$

A power series in $z$ can be identically zero only if the coefficients of $z^n$, $n = 0, 1, 2, \ldots$, vanish. Hence

$$(n + 2)(n + 1)c_{n+2} - (n + 1)c_n = 0 \qquad n = 0, 1, 2, \ldots$$

We obtain the recursion formula

$$c_{n+2} = \frac{c_n}{n + 2} \qquad n = 0, 1, 2, \ldots$$

We note that $c_2 = c_0/2$, $c_4 = c_2/4 = c_0/(2\cdot 4) = c_0/(2^2\,2!)$,

$$c_6 = \frac{c_4}{6} = \frac{c_0}{2\cdot 4\cdot 6} = \frac{c_0}{2^3\,3!}$$

$\ldots$, $c_{2n} = c_0/(2^n n!)$. Also $c_3 = c_1/3$, $c_5 = c_3/5 = c_1/(1\cdot 3\cdot 5)$,

$$c_7 = \frac{c_5}{7} = \frac{c_1}{1\cdot 3\cdot 5\cdot 7}$$

$\ldots$, $c_{2n+1} = c_1/(2n + 1)!!$, where

$$(2n + 1)!! = (2n + 1)(2n - 1)(2n - 3)\cdots 5\cdot 3\cdot 1$$

The constants $c_0$ and $c_1$ are arbitrary constants of integration. The general solution
of $w'' - zw' - w = 0$ is

$$w = A\sum_{n=0}^{\infty}\frac{z^{2n}}{2^n n!} + B\sum_{n=0}^{\infty}\frac{z^{2n+1}}{(2n + 1)!!}$$
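The closed forms for the coefficients in Example 5.15 can be checked against the recursion with exact rational arithmetic. The sketch below is illustrative: the function names and the choice $c_0 = c_1 = 1$ are assumptions made for the check, not part of the text.

```python
from fractions import Fraction
from math import factorial

def coefficients(c0, c1, count):
    """Return c_0 .. c_{count-1} generated by c_{n+2} = c_n / (n + 2)."""
    c = [Fraction(c0), Fraction(c1)]
    for n in range(count - 2):
        c.append(c[n] / (n + 2))
    return c

def double_factorial(k):
    """k!! = k (k - 2) (k - 4) ... ending at 1 or 2."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

C = coefficients(1, 1, 12)

# Compare with c_{2n} = c_0 / (2^n n!) and c_{2n+1} = c_1 / (2n + 1)!!.
EVEN_OK = all(C[2 * n] == Fraction(1, 2 ** n * factorial(n)) for n in range(6))
ODD_OK = all(C[2 * n + 1] == Fraction(1, double_factorial(2 * n + 1)) for n in range(6))
print(EVEN_OK, ODD_OK)
```

It may be noted that the even series sums to $e^{z^2/2}$, since $\sum (z^2/2)^n/n!$ is the exponential series in the variable $z^2/2$.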


Problems 


1. Show that, if $v(z)$ satisfies (5.95), then $v(z_0) = v_0$, $v'(z_0) = v_0'$. Also show that
$v(z)$ satisfies (5.94) if $v(z)$ satisfies (5.95).

2. Prove (5.99) by mathematical induction.

3. Solve $w'' - z^3w' + zw = 0$ for $w(z)$ in a Taylor series about $z = 0$.

4. Solve $w'' + \dfrac{1}{1 - z}w' + w = 0$ in a Taylor series about $z = 0$.


5.10. Singular Points. Let $z = a$ be an isolated singularity of either
$p(z)$ or $q(z)$ in (5.93), and surround $z = a$ by a circle, $C$, of radius $\rho$,
such that $p(z)$ and $q(z)$ are analytic for $0 < |z - a| < \rho$. Now let $z_0$ be
an ordinary point of (5.93), $0 < |z_0 - a| < \rho$ (see Fig. 5.2). From
Sec. 5.9 there exists a solution of (5.93) which is analytic in a neighborhood
of $z_0$. Call the solution $w_1(z, z_0)$. The Taylor-series expansion of
$w_1(z) \equiv w_1(z, z_0)$ about $z = z_0$ converges up to the nearest singularity of
$p(z)$ and $q(z)$, which may be the point $z = a$ or a point on the circle $C$.
In Fig. 5.2 $C'$ is the circle of convergence. Now let $\Gamma$ be any simple
closed curve through $z_0$ lying inside $C$ and surrounding $z = a$ (see Fig. 5.2).
We choose a point $z_1$ on $\Gamma$, $z_1$ inside $C'$. Since $w_1(z, z_0)$ is defined at $z_1$,
we can compute $w_1(z_1, z_0)$, $w_1'(z_1, z_0)$. Since $z_1$ is an ordinary point of
(5.93), there exists a unique function $w_{11}(z)$ such that $w_{11}(z)$ satisfies
(5.93) and $w_{11}(z_1) = w_1(z_1, z_0)$, $w_{11}'(z_1) = w_1'(z_1, z_0)$. $w_{11}(z)$ is an analytic
continuation of $w_1(z, z_0)$. Its domain of definition is the interior of the circle
$C''$ (see Fig. 5.2). This process of analytic continuation can be continued,
and the reader can show that in a finite number of such continuations
we can reach $z_0$. Let $w_1^*(z)$ be the analytic function which is the analytic
continuation of $w_1(z)$, both $w_1(z)$ and $w_1^*(z)$ analytic in a neighborhood
of $z = z_0$. We write $w_1^*(z) = Aw_1(z)$ and look upon $A$ as an operator associated
with analytic continuation. We leave it as an exercise for the reader to
show that, if $w_1(z)$ and $w_2(z)$ are linearly independent solutions of (5.93)
for some neighborhood of $z = z_0$, then $Aw_1(z)$ and $Aw_2(z)$ are also linearly independent.

Fig. 5.2

Since $z = a$ is a singular point, we cannot expect necessarily that
$Aw_1(z) = w_1(z)$. Let us now attempt to find a solution $w(z)$ of (5.93)
such that

$$Aw(z) = \lambda w(z) \tag{5.101}$$

If $w_1(z)$ and $w_2(z)$ are linearly independent solutions of (5.93) for a
neighborhood of $z = z_0$, we have

$$\begin{aligned} w(z) &= c_1w_1(z) + c_2w_2(z) \\ Aw_1(z) &= a_{11}w_1(z) + a_{12}w_2(z) \\ Aw_2(z) &= a_{21}w_1(z) + a_{22}w_2(z) \\ Aw(z) &= c_1Aw_1(z) + c_2Aw_2(z) \end{aligned} \tag{5.102}$$

The last equation of (5.102) implies that $Ac_1w_1 = c_1Aw_1$ and that

$$A(w_1 + w_2) = Aw_1 + Aw_2$$

In other words, $A$ is a linear operator. Let the reader deduce this fact.
From (5.101) and (5.102)

$$c_1(a_{11}w_1 + a_{12}w_2) + c_2(a_{21}w_1 + a_{22}w_2) = \lambda(c_1w_1 + c_2w_2)$$

which in turn implies

$$\begin{aligned} c_1(a_{11} - \lambda) + c_2a_{21} &= 0 \\ c_1a_{12} + c_2(a_{22} - \lambda) &= 0 \end{aligned} \tag{5.103}$$

A nontrivial solution in $c_1$, $c_2$ exists if and only if

$$\begin{vmatrix} a_{11} - \lambda & a_{21} \\ a_{12} & a_{22} - \lambda \end{vmatrix} = 0 \tag{5.104}$$

Equation (5.104) is a quadratic equation in $\lambda$, so possesses two distinct
roots or two equal roots.

Case 1. Let $\varphi_1(z)$, $\varphi_2(z)$ be the functions of (5.101) corresponding to
the distinct roots $\lambda_1$, $\lambda_2$, that is,

$$A\varphi_i = \lambda_i\varphi_i \qquad i = 1, 2 \tag{5.105}$$

It is an easy task to show that $\varphi_1$ and $\varphi_2$ are linearly independent.
Let us note the following: Let $z - a = |z - a|e^{i\theta}$ and

$$(z - a)^{\alpha} = |z - a|^{\alpha}e^{i\alpha\theta}$$

The analytic continuation of $(z - a)^{\alpha}$ around $\Gamma$ increases $\theta$ by $2\pi$ so that

$$A(z - a)^{\alpha} = e^{2\pi i\alpha}(z - a)^{\alpha}$$

We choose $\alpha_i$, $i = 1, 2$, so that $e^{2\pi i\alpha_i} = \lambda_i$, $i = 1, 2$. Hence

$$A(z - a)^{\alpha_i} = \lambda_i(z - a)^{\alpha_i} \qquad i = 1, 2 \tag{5.106}$$
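The multiplier $e^{2\pi i\alpha}$ in (5.106) can be seen numerically by following the angle $\theta$ continuously once around $z = a$ instead of using a principal value. The sketch below is an illustration only: the function name `power_branch` and the sample values of $\alpha$, $|z - a|$, and $\theta$ are arbitrary choices made here.

```python
import cmath
import math

def power_branch(r, theta, alpha):
    """(z - a)^alpha for z - a = r e^{i theta}, with theta tracked continuously.

    This is |z - a|^alpha e^{i alpha theta}, written as exp(alpha (ln r + i theta)).
    """
    return cmath.exp(alpha * (math.log(r) + 1j * theta))

alpha = 0.3 + 0.1j      # arbitrary (complex) exponent
r, theta = 0.7, 0.4     # arbitrary starting point z - a = r e^{i theta}

before = power_branch(r, theta, alpha)
after = power_branch(r, theta + 2 * math.pi, alpha)   # one circuit around z = a

RATIO = after / before
EXPECTED = cmath.exp(2j * math.pi * alpha)
print(RATIO, EXPECTED)
```

The ratio does not depend on the starting point, which is why a single constant $\lambda_i = e^{2\pi i\alpha_i}$ can be matched to each root of (5.104).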


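Combining (5.105) and (5.106) yields

$$\begin{aligned} A[(z - a)^{-\alpha_i}\varphi_i(z)] &= A(z - a)^{-\alpha_i}A\varphi_i(z) \qquad \text{Why?} \\ &= \frac{1}{\lambda_i}(z - a)^{-\alpha_i}\lambda_i\varphi_i(z) \\ &= (z - a)^{-\alpha_i}\varphi_i(z) \qquad i = 1, 2 \end{aligned} \tag{5.107}$$

Thus the functions $F_i(z) = (z - a)^{-\alpha_i}\varphi_i(z)$, $i = 1, 2$, are single-valued
inside $C$ with a possible singularity at $z = a$. From Laurent's expansion
theorem we have

$$(z - a)^{-\alpha_1}\varphi_1(z) = \sum_{n=-\infty}^{\infty}b_n(z - a)^n \qquad (z - a)^{-\alpha_2}\varphi_2(z) = \sum_{n=-\infty}^{\infty}c_n(z - a)^n$$

or

$$\varphi_1(z) = (z - a)^{\alpha_1}\sum_{n=-\infty}^{\infty}b_n(z - a)^n \qquad \varphi_2(z) = (z - a)^{\alpha_2}\sum_{n=-\infty}^{\infty}c_n(z - a)^n \tag{5.108}$$

Thus both $\varphi_1(z)$ and $\varphi_2(z)$ have, in general, a branch point and an essential
singularity at $z = a$.

Case 2. The reader is referred to MacRobert, "Functions of a Complex
Variable," if $\lambda_1 = \lambda_2$. In this case

$$\varphi(z) = (z - a)^{\alpha}\left[w_1(z) + w_2(z)\ln(z - a)\right] \tag{5.109}$$

where $w_1(z)$ and $w_2(z)$ are analytic at $z = a$.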


An important case occurs when the functions $\varphi_1(z)$, $\varphi_2(z)$ of (5.108)
have no essential singularities. In this case we can write

$$\varphi_i(z) = (z - a)^{\beta_i}\psi_i(z) \qquad i = 1, 2 \tag{5.110}$$

where $\psi_i(z)$ is analytic at $z = a$; $\psi_i(a) \neq 0$. Now

$$\varphi_1'' + p(z)\varphi_1' + q(z)\varphi_1 = 0 \qquad \varphi_2'' + p(z)\varphi_2' + q(z)\varphi_2 = 0 \tag{5.111}$$

so that $\varphi_2\varphi_1'' - \varphi_1\varphi_2'' + p(z)(\varphi_2\varphi_1' - \varphi_1\varphi_2') = 0$. Thus

$$\frac{d}{dz}(\varphi_2\varphi_1' - \varphi_1\varphi_2') = -p(z)(\varphi_2\varphi_1' - \varphi_1\varphi_2')$$

and

$$p(z) = -\frac{1}{W(\varphi_1, \varphi_2)}\frac{dW(\varphi_1, \varphi_2)}{dz} \tag{5.112}$$

Making use of (5.110) to (5.112), let the reader show that

$$p(z) = \frac{c}{z - a} + p^*(z) \qquad q(z) = \frac{d_1}{(z - a)^2} + \frac{d_2}{z - a} + q^*(z) \tag{5.113}$$

with $p^*(z)$, $q^*(z)$ analytic at $z = a$. The same result occurs for $\varphi(z)$ of
(5.109). We notice that $p(z)$ of (5.113) has at most a simple pole at
$z = a$ and $q(z)$ has at most a double pole at $z = a$. This suggests the
following definition:

DEFINITION 5.1. $z = z_0$ is said to be a regular singular point of

$$w'' + p(z)w' + q(z)w = 0 \tag{5.114}$$

if:

(i) $z_0$ is not an ordinary point.

(ii) $(z - z_0)p(z)$ is analytic at $z = z_0$.

(iii) $(z - z_0)^2q(z)$ is analytic at $z = z_0$.

We now attempt to find a solution of (5.114) in the neighborhood of a
regular singular point. To simplify matters, we assume that $z_0 = 0$.
The transformation $z' = z - z_0$ transfers the singularity to the origin.
Let $w(z) = z^{\alpha}u(z)$, $\alpha$ an unknown constant, $u(z)$ undefined as yet. For
$w = z^{\alpha}u$ we have $w' = z^{\alpha}u'(z) + \alpha z^{\alpha-1}u$,

$$w'' = z^{\alpha}u'' + 2\alpha z^{\alpha-1}u' + \alpha(\alpha - 1)z^{\alpha-2}u$$

If $w(z)$ satisfies (5.114), then $u(z)$ must satisfy

$$z^2u'' + [2\alpha z + p(z)z^2]u' + [\alpha(\alpha - 1) + \alpha zp(z) + z^2q(z)]u = 0 \tag{5.115}$$

Now

$$zp(z) = p_0 + p_1z + p_2z^2 + \cdots \qquad z^2q(z) = q_0 + q_1z + q_2z^2 + \cdots$$

Equation (5.115) becomes

$$zu''(z) + P(z)u'(z) + Q(z)u(z) = 0 \tag{5.116}$$

provided $\alpha$ is chosen as a root of the indicial equation

$$\alpha(\alpha - 1) + \alpha p_0 + q_0 = 0 \tag{5.117}$$

with

$$P(z) = 2\alpha + zp(z) = 2\alpha + p_0 + p_1z + p_2z^2 + \cdots \qquad Q(z) = \alpha(p_1 + p_2z + \cdots) + q_1 + q_2z + \cdots$$

Conversely, if $u(z)$ satisfies (5.116), then $w(z) = z^{\alpha}u(z)$ will satisfy (5.114)
provided $\alpha$ satisfies (5.117). The existence of a solution of (5.114) hinges
on the existence of a solution of (5.116). If $u(z)$ satisfies (5.116), $u(0) = u_0$,
then a double integration of (5.116) yields

$$zu(z) = [P(0) - 1]u_0z + \int_0^z\left[2 - P(\tau) + (\tau - z)(Q(\tau) - P'(\tau))\right]u(\tau)\,d\tau$$

and

$$u(z) = [P(0) - 1]u_0 + \frac{1}{z}\int_0^zB(z, \tau)u(\tau)\,d\tau \qquad z \neq 0 \tag{5.118}$$

with $B(z, \tau) = 2 - P(\tau) + (\tau - z)[Q(\tau) - P'(\tau)]$.

We leave it to the reader to show that, if $u(z)$ satisfies (5.118), then $u(z)$
satisfies (5.116). The reader can also show that if we define $u(0) = u_0$
then $u(z)$ of (5.118) is analytic at $z = 0$ so that (5.118) holds for $z = 0$.
To do this, the reader must show that $\lim_{z\to 0}u(z) = u_0$.


Now $P(0) = 2\alpha + p_0$ and $\alpha_1 + \alpha_2 = 1 - p_0$ if $\alpha_1$ and $\alpha_2$ are the roots
of (5.117). If $\alpha_1 = \alpha_2$, then $P(0) = 1$ so that (5.118) becomes

$$u(z) = \frac{1}{z}\int_0^zB(z, \tau)u(\tau)\,d\tau \tag{5.119}$$

If we attempt to prove the existence of a solution of (5.119) by Picard's
method of successive approximations, we might expect trouble at $z = 0$.
Let our first approximation be $u_0(z) = u_0$, and define

$$u_1(z) = \frac{1}{z}\int_0^zB(z, \tau)u_0\,d\tau \qquad z \neq 0$$

Let the reader show that $\lim_{z\to 0}u_1(z) = u_0$, so that, if we define $u_1(0) = u_0$,
$u_1(z)$ becomes continuous and, indeed, analytic at $z = 0$. Continuing
this process yields

$$u_n(z) = \frac{1}{z}\int_0^zB(z, \tau)u_{n-1}(\tau)\,d\tau \tag{5.120}$$

with $u_n(0) = u_0$, $n = 0, 1, 2, \ldots$. We define $u_n(0) = u_0$, and the
reader can show that $u_n(z)$ is analytic at $z = 0$. Since $P(0) = 1$, we can
write $P(\tau) = 1 + \tau f(\tau)$ and $B(z, \tau) = 1 - \tau f(\tau) + (\tau - z)[Q(\tau) - P'(\tau)]$.
We can certainly pick a small circle, $C'$, with center at $z = 0$ such that
$|B(z, \tau)| < R < 2$ for all $z$, $\tau$ inside and on the circle. Moreover

$$|u_1(z) - u_0(z)| = \left|\frac{u_0}{z}\int_0^z\{1 - P(\tau) + (\tau - z)[Q(\tau) - P'(\tau)]\}\,d\tau\right| = \left|\frac{u_0}{z}\int_0^z\{-\tau f(\tau) + (\tau - z)[Q(\tau) - P'(\tau)]\}\,d\tau\right|$$

If we let $\tau = tz$, $0 \le t \le 1$, $t$ the variable of integration, it is easy to
see that

$$|u_1(z) - u_0(z)| < A|z| \qquad A = \text{constant}$$

for all $z$ inside $C'$.


Since

$$|u_2(z) - u_1(z)| = \left|\frac{1}{z}\int_0^zB(z, \tau)[u_1(\tau) - u_0(\tau)]\,d\tau\right|$$

we have

$$|u_2(z) - u_1(z)| < \frac{AR}{|z|}\int_0^z|\tau|\,|d\tau| = AR\frac{|z|}{2}$$

Continuing, we obtain

$$|u_n(z) - u_{n-1}(z)| < A\left(\frac{R}{2}\right)^{n-1}|z|$$

Since the series $A|z|\sum_{n=1}^{\infty}(R/2)^{n-1}$ converges uniformly inside and on $C'$,
remember $R < 2$, the sequence $\{u_n(z)\}$ converges uniformly to an analytic
function $u(z)$. As was done in Sec. 5.9 for $z = z_0$ an ordinary point, it
can be shown that $u(z)$ satisfies (5.116) and (5.115). It is then trivial
to show that $w(z) = z^{\alpha}u(z)$ satisfies (5.114).


Example 5.16. $zw'' + w' - w = 0$ or $w'' + (1/z)w' - (1/z)w = 0$. In this case
$p(z) = 1/z$, $q(z) = -1/z$ and $zp(z) = 1$, $z^2q(z) = -z$ so that $z = 0$ is a regular
singular point. We know that a solution exists in the form $w = z^{\alpha}u(z)$, where $\alpha$
satisfies a quadratic equation and $u(z)$ is analytic at $z = 0$. One of the simplest ways
of determining $u(z)$ is to assume $u(z) = \sum_{n=0}^{\infty} c_n z^n$, $w(z) = \sum_{n=0}^{\infty} c_n z^{n+\alpha}$. The series for
$w(z)$ is substituted into $zw'' + w' - w = 0$, and the coefficient of each power of $z$ is
equated to zero. More specifically, we have

$$w'(z) = \sum_{n=0}^{\infty} c_n(n + \alpha)z^{n+\alpha-1} \qquad w''(z) = \sum_{n=0}^{\infty} c_n(n + \alpha)(n + \alpha - 1)z^{n+\alpha-2}$$

and hence $w(z) = \sum_{n=0}^{\infty} c_n z^{n+\alpha}$ satisfies $zw'' + w' - w = 0$ if and only if

$$\sum_{n=0}^{\infty} c_n(n + \alpha)(n + \alpha - 1)z^{n+\alpha-1} + \sum_{n=0}^{\infty} c_n(n + \alpha)z^{n+\alpha-1} - \sum_{n=0}^{\infty} c_n z^{n+\alpha} = 0 \tag{5.121}$$

The lowest power of $z$ which occurs in (5.121) is $z^{\alpha-1}$. The coefficient of $z^{\alpha-1}$ is
$c_0\alpha(\alpha - 1) + c_0\alpha = c_0[\alpha(\alpha - 1) + \alpha]$. If (5.121) is to be satisfied, we must have

$$c_0[\alpha(\alpha - 1) + \alpha] = 0$$

$c_0$ is to be arbitrary so that $\alpha$ must satisfy the quadratic (indicial) equation

$$\alpha(\alpha - 1) + \alpha = 0$$




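or $\alpha^2 = 0$. The roots of the indicial equation are $\alpha = 0, 0$. The coefficient of $z^{n+\alpha-1}$
in (5.121) is

$$c_n(n + \alpha)(n + \alpha - 1) + c_n(n + \alpha) - c_{n-1} \qquad n = 1, 2, 3, \ldots$$

If (5.121) is to be satisfied, we must have

$$c_n(n + \alpha)(n + \alpha - 1) + c_n(n + \alpha) - c_{n-1} = 0$$

or

$$c_n = \frac{c_{n-1}}{(n + \alpha)^2} \qquad n = 1, 2, 3, \ldots \tag{5.122}$$

Equation (5.122) is the recursion formula for the $c_n$, $n = 1, 2, \ldots$. Since $\alpha = 0$, we
have

$$c_n = \frac{c_{n-1}}{n^2} \qquad n = 1, 2, \ldots$$

so that

$$c_1 = \frac{c_0}{1^2} \qquad c_2 = \frac{c_1}{2^2} = \frac{c_0}{1^2 \cdot 2^2} \qquad c_3 = \frac{c_2}{3^2} = \frac{c_0}{1^2 \cdot 2^2 \cdot 3^2} = \frac{c_0}{(3!)^2}$$

and by induction $c_n = c_0/(n!)^2$. Hence a solution of $zw'' + w' - w = 0$ is

$$w(z) = A\sum_{n=0}^{\infty} \frac{z^n}{(n!)^2}$$

We postpone the discussion for finding a second independent solution of
$zw'' + w' - w = 0$.

The solution $w(z) = A\sum_{n=0}^{\infty} z^n/(n!)^2$ converges for $0 \le |z| < \infty$. This is exactly the
region of analyticity of $zp(z) = 1$, $z^2q(z) = -z$.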
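
The recursion (5.122) with $\alpha = 0$ can be checked mechanically. The following is a minimal sketch in Python, assuming exact rational arithmetic from the standard `fractions` module; the function names are illustrative and do not come from the text. It builds the $c_n$ with $c_0 = 1$ and confirms that every coefficient of $zw'' + w' - w$ that the truncated series determines is zero.

```python
from fractions import Fraction

def series_coefficients(n_terms):
    """Coefficients c_n of w(z) = sum c_n z^n from c_n = c_{n-1}/n^2, with c_0 = 1."""
    coeffs = [Fraction(1)]
    for n in range(1, n_terms):
        coeffs.append(coeffs[-1] / n**2)
    return coeffs

def residual_coefficients(coeffs):
    """Coefficient of z^k in z w'' + w' - w for k = 0 .. len(coeffs) - 2."""
    # z w'' contributes (k+1) k c_{k+1}, w' contributes (k+1) c_{k+1}, -w contributes -c_k.
    return [(k + 1) * k * coeffs[k + 1] + (k + 1) * coeffs[k + 1] - coeffs[k]
            for k in range(len(coeffs) - 1)]

coeffs = series_coefficients(8)
residual = residual_coefficients(coeffs)
```

Here `coeffs[n]` equals $1/(n!)^2$ exactly, and `residual` is a list of exact zeros.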


The reader can find a proof in Copson, "Theory of Functions of a
Complex Variable," that, if $zp(z)$ and $z^2q(z)$ are analytic for $|z| < R$,
then $w'' + p(z)w' + q(z)w = 0$ has a solution $w(z) = \sum_{n=0}^{\infty} c_n z^{n+\alpha}$ and the
series $\sum_{n=0}^{\infty} c_n z^n$ converges at least for $|z| < R$. The roots of the indicial
equation need not be equal.
Example 5.17. $zw'' + (z - 1)w' + w = 0$. The reader can easily note that $z = 0$
is a regular singular point. Let $w(z) = \sum_{n=0}^{\infty} c_n z^{n+\alpha}$, so that

$$w'(z) = \sum_{n=0}^{\infty} c_n(n + \alpha)z^{n+\alpha-1} \qquad w''(z) = \sum_{n=0}^{\infty} c_n(n + \alpha)(n + \alpha - 1)z^{n+\alpha-2}$$

Substituting into




$$zw'' + (z - 1)w' + w = 0$$

yields

$$\sum_{n=0}^{\infty} c_n(n + \alpha)(n + \alpha - 2)z^{n+\alpha-1} + \sum_{n=0}^{\infty} c_n(n + \alpha + 1)z^{n+\alpha} = 0 \tag{5.123}$$

The smallest power of $z$ occurring in (5.123) is $z^{\alpha-1}$, whose coefficient must be equated
to zero. Thus

$$c_0\alpha(\alpha - 2) = 0$$

and $\alpha$ satisfies the indicial equation $\alpha(\alpha - 2) = 0$, so that $\alpha = 0, 2$. We shall see
that the larger of these two roots produces no difficulty, while the smaller of these two roots
does produce a difficulty.

The coefficient of $z^{n+\alpha-1}$, $n = 1, 2, \ldots$, in (5.123) must be set equal to zero. This
leads to the recursion formula

$$c_n(n + \alpha)(n + \alpha - 2) + c_{n-1}(n + \alpha) = 0$$

or

$$c_n = -\frac{c_{n-1}}{n + \alpha - 2} \qquad n = 1, 2, 3, \ldots$$

If we try $\alpha = 0$, we have $c_n = -c_{n-1}/(n - 2)$ and $c_2$ becomes meaningless. We try
$\alpha = 2$ and obtain $c_n = -c_{n-1}/n$, so that $c_1 = -c_0$, $c_2 = -c_1/2 = c_0/2$,
$c_3 = -c_2/3 = -c_0/3!$, $c_4 = -c_3/4 = c_0/4!$, and by induction $c_n = (-1)^n(c_0/n!)$. Hence

$$w(z) = Az^2\sum_{n=0}^{\infty} (-1)^n\frac{z^n}{n!} = Az^2e^{-z}$$

is a solution of $zw'' + (z - 1)w' + w = 0$.
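
The behavior of the two indicial roots in Example 5.17 can be seen by running the recursion for each. This is a minimal sketch in Python, assuming exact rational arithmetic and $c_0 = 1$; `frobenius_coefficients` is an illustrative name, not one used in the text. For $\alpha = 2$ the coefficients reproduce those of $e^{-z}$, and for $\alpha = 0$ the recursion fails at $n = 2$.

```python
from fractions import Fraction
from math import factorial

def frobenius_coefficients(alpha, n_terms):
    """c_n from c_n = -c_{n-1}/(n + alpha - 2), starting from c_0 = 1."""
    coeffs = [Fraction(1)]
    for n in range(1, n_terms):
        denom = n + alpha - 2
        if denom == 0:
            # This is the breakdown the text describes for the smaller root.
            raise ZeroDivisionError(f"recursion breaks down at n = {n} for alpha = {alpha}")
        coeffs.append(-coeffs[-1] / denom)
    return coeffs

coeffs_alpha2 = frobenius_coefficients(2, 8)
# Taylor coefficients of exp(-z), to compare with w(z) = z^2 exp(-z).
expected = [Fraction((-1) ** n, factorial(n)) for n in range(8)]
```

Calling `frobenius_coefficients(0, 4)` raises the error at $n = 2$, which is the point where $c_2$ becomes meaningless.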


It is not very difficult to see that a second solution cannot be obtained
by the method of series solution used in Examples 5.16 and 5.17 when
the roots of the indicial equation differ by an integer. When the roots
do differ by an integer the larger of the two roots presents no difficulty.
Let $w(z) = \sum_{n=0}^{\infty} c_n z^{n+\alpha}$ be a solution of (5.114). Then

$$w'(z) = \sum_{n=0}^{\infty} c_n(n + \alpha)z^{n+\alpha-1} \qquad w''(z) = \sum_{n=0}^{\infty} c_n(n + \alpha)(n + \alpha - 1)z^{n+\alpha-2}$$

and substituting into (5.114) yields

$$\sum_{n=0}^{\infty} c_n(n + \alpha)(n + \alpha - 1)z^{n+\alpha-2} + p(z)\sum_{n=0}^{\infty} c_n(n + \alpha)z^{n+\alpha-1} + q(z)\sum_{n=0}^{\infty} c_n z^{n+\alpha} = 0 \tag{5.124}$$


Since $zp(z) = p_0 + p_1z + p_2z^2 + \cdots$, $z^2q(z) = q_0 + q_1z + q_2z^2 + \cdots$,
we have

$$\sum_{n=0}^{\infty} [c_n(n + \alpha)(n + \alpha - 1) + c_n(n + \alpha)(p_0 + p_1z + \cdots) + c_n(q_0 + q_1z + \cdots)]z^{n+\alpha-2} = 0$$

Of necessity

$$\alpha(\alpha - 1) + \alpha p_0 + q_0 = 0 \qquad \text{indicial equation}$$

The coefficient of $z^{n+\alpha-2}$ must be zero, which implies

$$c_n[(n + \alpha)(n + \alpha - 1) + (n + \alpha)p_0 + q_0] = -\sum_{s=0}^{n-1} c_s[(s + \alpha)p_{n-s} + q_{n-s}] \qquad n \ge 1 \tag{5.125}$$

Equation (5.125), the recursion formula for the $c_n$, will determine $c_n$,
$n = 1, 2, \ldots$, in terms of $c_0$ unless

$$F(\alpha, n) = (n + \alpha)(n + \alpha - 1) + (n + \alpha)p_0 + q_0 = (n + \alpha)(n + \alpha - 1 + p_0) + q_0 = 0$$

for some integer $n$. If $\alpha$, $\alpha_1$ are the roots of the indicial equation, $\alpha \ge \alpha_1$,
we have $\alpha + \alpha_1 = 1 - p_0$, $\alpha\alpha_1 = q_0$, so that

$$F(\alpha, n) = (n + \alpha)(n - \alpha_1) + \alpha\alpha_1 = n(n + \alpha - \alpha_1)$$

Since $\alpha \ge \alpha_1$, $F(\alpha, n) \ne 0$ for $n \ge 1$. The only difficulty would occur if
$\alpha - \alpha_1$ = negative integer. To obtain a second solution when the roots
of the indicial equation differ by an integer, we turn to the method of
Frobenius.

Method of Frobenius. We use (5.125) to determine $c_1, c_2, \ldots, c_n, \ldots$
in terms of $c_0$ and $\alpha$. The specific value of $\alpha$ is not inserted. The
series

$$w(z, \alpha) = \sum_{n=0}^{\infty} c_n(\alpha)z^{n+\alpha} \tag{5.126}$$

determined from (5.125) does not satisfy (5.114) but satisfies

$$\frac{\partial^2 w(z, \alpha)}{\partial z^2} + p(z)\frac{\partial w(z, \alpha)}{\partial z} + q(z)w(z, \alpha) = c_0(\alpha - \alpha_1)(\alpha - \alpha_2)z^{\alpha-2} \tag{5.127}$$

where $\alpha_1$ and $\alpha_2$ are the roots of the indicial equation. The $c_n(\alpha)$ of
(5.126) are well defined for the larger of the two roots, say, $\alpha_1$. Now let




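$c_0 = A(\alpha - \alpha_2)$ so that (5.127) becomes

$$\frac{\partial^2 w(z, \alpha)}{\partial z^2} + p(z)\frac{\partial w(z, \alpha)}{\partial z} + q(z)w(z, \alpha) = A(\alpha - \alpha_1)(\alpha - \alpha_2)^2z^{\alpha-2} \tag{5.128}$$

The change of $c_0$ to $A(\alpha - \alpha_2)$ applies also to the coefficients $c_1, c_2, \ldots$
since these coefficients depend on $c_0$. If $\alpha_1 = \alpha_2$, it is not necessary to
replace $c_0$ by $A(\alpha - \alpha_2)$. We differentiate (5.128) with respect to $\alpha$.
Since $\alpha$ and $z$ are independent variables, we have

$$\frac{\partial^2}{\partial z^2}\left(\frac{\partial w}{\partial \alpha}\right) + p(z)\frac{\partial}{\partial z}\left(\frac{\partial w}{\partial \alpha}\right) + q(z)\left(\frac{\partial w}{\partial \alpha}\right)
= 2A(\alpha - \alpha_1)(\alpha - \alpha_2)z^{\alpha-2} + A(\alpha - \alpha_2)^2\frac{\partial}{\partial \alpha}[(\alpha - \alpha_1)z^{\alpha-2}]$$

Evaluating at $\alpha = \alpha_2$ yields

$$\frac{\partial^2}{\partial z^2}\left[\left(\frac{\partial w}{\partial \alpha}\right)_{\alpha=\alpha_2}\right] + p(z)\frac{\partial}{\partial z}\left[\left(\frac{\partial w}{\partial \alpha}\right)_{\alpha=\alpha_2}\right] + q(z)\left(\frac{\partial w}{\partial \alpha}\right)_{\alpha=\alpha_2} = 0 \tag{5.129}$$

Equation (5.129) shows that $(\partial w/\partial \alpha)_{\alpha=\alpha_2}$ is a solution of (5.114).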
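Example 5.18. Returning to Example 5.16, we have

$$c_1 = \frac{c_0}{(1 + \alpha)^2} \qquad c_2 = \frac{c_1}{(2 + \alpha)^2} = \frac{c_0}{(1 + \alpha)^2(2 + \alpha)^2} \qquad \ldots \qquad c_n = \frac{c_0}{(1 + \alpha)^2(2 + \alpha)^2 \cdots (n + \alpha)^2}$$

so that

$$w(z, \alpha) = c_0z^{\alpha}\left[1 + \frac{z}{(1 + \alpha)^2} + \frac{z^2}{(1 + \alpha)^2(2 + \alpha)^2} + \cdots + \frac{z^n}{(1 + \alpha)^2(2 + \alpha)^2 \cdots (n + \alpha)^2} + \cdots\right]$$

To compute $(\partial w/\partial \alpha)_{\alpha=0}$, we need to compute

$$\frac{d}{d\alpha}\left[\frac{1}{(1 + \alpha)^2(2 + \alpha)^2 \cdots (n + \alpha)^2}\right]_{\alpha=0}$$

Let $y = (1 + \alpha)^{-2}(2 + \alpha)^{-2} \cdots (n + \alpha)^{-2}$ so that

$$\ln y = -2\sum_{k=1}^{n} \ln(k + \alpha) \qquad \text{and} \qquad \frac{1}{y}\frac{dy}{d\alpha} = -2\sum_{k=1}^{n} \frac{1}{k + \alpha}$$

Hence

$$\left(\frac{dy}{d\alpha}\right)_{\alpha=0} = -2y(0)\sum_{k=1}^{n} \frac{1}{k} = -\frac{2F(n)}{(n!)^2}$$

where $F(n) = \sum_{k=1}^{n} 1/k$. Thus

$$\left(\frac{\partial w}{\partial \alpha}\right)_{\alpha=0} = c_0\left[\ln z\sum_{n=0}^{\infty} \frac{z^n}{(n!)^2} - 2\sum_{n=1}^{\infty} \frac{F(n)z^n}{(n!)^2}\right]$$

The second solution of $zw'' + w' - w = 0$ is

$$w(z) = B\left[\ln z\sum_{n=0}^{\infty} \frac{z^n}{(n!)^2} - 2\sum_{n=1}^{\infty} \frac{F(n)z^n}{(n!)^2}\right]$$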
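
The logarithmic solution of Example 5.18 can be verified coefficient by coefficient. Write $S(z) = \sum z^n/(n!)^2$ and $T(z) = \sum F(n)z^n/(n!)^2$ with $F(n) = 1 + 1/2 + \cdots + 1/n$. Substituting $\ln z \cdot S - 2T$ into $zw'' + w' - w$, the terms multiplying $\ln z$ cancel because $S$ is a solution, and what remains is $2S' - 2(zT'' + T' - T)$. The coefficient of $z^{n-1}$ therefore vanishes when $n^2t_n - t_{n-1} = ns_n$. The sketch below (Python, exact rationals; the helper names are assumptions made for illustration) tests that identity.

```python
from fractions import Fraction
from math import factorial

def harmonic(n):
    """F(n) = 1 + 1/2 + ... + 1/n, with F(0) = 0."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def identity_holds(n):
    """Check n^2 t_n - t_{n-1} == n s_n, where s_n = 1/(n!)^2 and t_n = F(n)/(n!)^2."""
    s_n = Fraction(1, factorial(n) ** 2)
    t_n = harmonic(n) * s_n
    t_prev = harmonic(n - 1) * Fraction(1, factorial(n - 1) ** 2)
    return n**2 * t_n - t_prev == n * s_n

checks = [identity_holds(n) for n in range(1, 12)]
```

Every entry of `checks` is `True`, so the series part of the second solution is consistent with the differential equation to the order tested.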


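Example 5.19. Referring to Example 5.17, we have

$$c_1 = -\frac{c_0}{\alpha - 1} \qquad c_2 = -\frac{c_1}{\alpha} = \frac{c_0}{(\alpha - 1)\alpha} \qquad c_3 = -\frac{c_2}{\alpha + 1} = -\frac{c_0}{(\alpha - 1)\alpha(\alpha + 1)} \qquad \ldots$$

$$c_n = \frac{(-1)^nc_0}{(\alpha - 1)\alpha(\alpha + 1) \cdots (\alpha + n - 2)}$$

We replace $c_0$ by $A\alpha$ and obtain

$$w(z, \alpha) = Az^{\alpha}\left[\alpha - \frac{\alpha z}{\alpha - 1} + \frac{z^2}{\alpha - 1} - \frac{z^3}{(\alpha - 1)(\alpha + 1)} + \cdots + \frac{(-1)^nz^n}{(\alpha - 1)(\alpha + 1) \cdots (\alpha + n - 2)} + \cdots\right]$$

To compute $(\partial w/\partial \alpha)_{\alpha=0}$, we need to compute

$$\frac{d}{d\alpha}\left[\frac{1}{(\alpha - 1)(\alpha + 1)(\alpha + 2) \cdots (\alpha + n - 2)}\right]_{\alpha=0} \qquad n \ge 3$$

Let $y = (\alpha - 1)^{-1}(\alpha + 1)^{-1}(\alpha + 2)^{-1} \cdots (\alpha + n - 2)^{-1}$, so that

$$\ln y = -\ln(\alpha - 1) - \sum_{k=1}^{n-2} \ln(\alpha + k)$$

$$\left(\frac{dy}{d\alpha}\right)_{\alpha=0} = y(0)\left[1 - \sum_{k=1}^{n-2} \frac{1}{k}\right] = \frac{1}{(n - 2)!}\sum_{k=2}^{n-2} \frac{1}{k}$$

Hence

$$\left(\frac{\partial w}{\partial \alpha}\right)_{\alpha=0} = A\ln z\left(-z^2 + z^3 - \frac{z^4}{2!} + \cdots\right) + A\left[1 + z - z^2 + \sum_{n=2}^{\infty} \frac{F(n)z^{n+2}}{n!}\right]$$

The second solution of $zw'' + (z - 1)w' + w = 0$ is

$$w(z) = B\left[z^2e^{-z}\ln z - 1 - z + z^2 - \sum_{n=2}^{\infty} \frac{F(n)z^{n+2}}{n!}\right]$$

where $F(n) = (-1)^n\sum_{k=2}^{n} \frac{1}{k}$.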
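
The second solution of Example 5.19 can be checked in the same spirit. With $S = z^2e^{-z}$ a solution, applying the operator $L[w] = zw'' + (z - 1)w' + w$ to $S\ln z$ leaves $2S' + S - 2S/z$, which must cancel against $L$ applied to the power-series part $g(z) = -1 - z + z^2 - \sum_{n \ge 2} F(n)z^{n+2}/n!$ with $F(n) = (-1)^n\sum_{k=2}^{n} 1/k$. This is a sketch in Python under those assumptions; the truncation order `N` and all names are illustrative.

```python
from fractions import Fraction
from math import factorial

N = 12  # compare coefficients of z^0 .. z^(N-1)

def apply_operator(c):
    """Coefficients of z w'' + (z - 1) w' + w for w = sum c[k] z^k."""
    out = [Fraction(0)] * (len(c) + 1)
    for k, ck in enumerate(c):
        out[k] += (k + 1) * ck              # from z w' + w
        if k >= 1:
            out[k - 1] += k * (k - 2) * ck  # from z w'' - w'
    return out

def F(n):
    return (-1) ** n * sum(Fraction(1, k) for k in range(2, n + 1))

size = N + 2
s = [Fraction(0)] * size                    # S(z) = z^2 exp(-z)
for k in range(2, size):
    s[k] = Fraction((-1) ** (k - 2), factorial(k - 2))

g = [Fraction(0)] * size                    # power-series part of the second solution
g[0], g[1], g[2] = Fraction(-1), Fraction(-1), Fraction(1)
for n in range(2, size - 2):
    g[n + 2] -= F(n) / factorial(n)

# Contribution left over from the logarithmic term: 2 S' + S - 2 S / z.
log_part = [2 * (k + 1) * s[k + 1] + s[k] - 2 * s[k + 1] for k in range(N)]
poly_part = apply_operator(g)
residual = [log_part[k] + poly_part[k] for k in range(N)]
```

All entries of `residual` are exact zeros, so the expression satisfies the equation through the order tested.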




Problems 


1. Derive (5.113).

2. Show that if $u(z)$ satisfies (5.118) then $u(z)$ satisfies (5.116).

3. Derive (5.125).

4. Solve $zw'' - w = 0$.

Ans. $w_1(z) = A\sum_{n=0}^{\infty} \dfrac{z^{n+1}}{n!\,(n + 1)!}$


= grt 22 2 1 
w2(z) = B | In z > mak ~*~ Gm (i +3) 
n «() 


2 
- sm (F gee or) oe | 


5. Solve $z^2(1 - z)w'' + z(1 - 3z)w' - w = 0$ for $w(z)$ in the neighborhood of
$z = 0$.

6. Solve $z(1 - z)w'' - (1 + z)w' + w = 0$ for $w(z)$ in the neighborhood of $z = 0$.

7. Solve $4z^2w'' + 4zw' - (z^2 + 1)w = 0$ for $w(z)$ in the neighborhood of $z = 0$.


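5.11. The Point at Infinity and the Hypergeometric Function. To
determine whether the point at infinity is an ordinary or regular singular
point, we let $z = 1/t$ and investigate the point $t = 0$. We have

$$\frac{dw}{dz} = \frac{dw}{dt}\frac{dt}{dz} = -t^2\frac{dw}{dt} \qquad \frac{d^2w}{dz^2} = t^4\frac{d^2w}{dt^2} + 2t^3\frac{dw}{dt}$$

so that $w'' + p(z)w' + q(z)w = 0$ becomes

$$\frac{d^2w}{dt^2} + \frac{2t - p(1/t)}{t^2}\frac{dw}{dt} + \frac{q(1/t)}{t^4}w = 0 \tag{5.130}$$

Hence $z = \infty$ is an ordinary point if

$$\text{(i)} \quad \frac{2t - p(1/t)}{t^2} \text{ is analytic at } t = 0 \qquad \text{(ii)} \quad \frac{1}{t^4}q\left(\frac{1}{t}\right) \text{ is analytic at } t = 0 \tag{5.131}$$

For this case a solution can be found in the form

$$w(z) = a_0 + \frac{a_1}{z} + \frac{a_2}{z^2} + \cdots$$

Again $z = \infty$ is a regular singular point if $z = \infty$ is not an ordinary
point and if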




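$$\text{(iii)} \quad \frac{2t - p(1/t)}{t} \text{ is analytic at } t = 0 \qquad \text{(iv)} \quad \frac{1}{t^2}q\left(\frac{1}{t}\right) \text{ is analytic at } t = 0 \tag{5.132}$$

This implies

$$\frac{1}{t}p\left(\frac{1}{t}\right) = p_0 + p_1t + p_2t^2 + \cdots \qquad \frac{1}{t^2}q\left(\frac{1}{t}\right) = q_0 + q_1t + q_2t^2 + \cdots$$

so that

$$p(z) = \frac{p_0}{z} + \frac{p_1}{z^2} + \frac{p_2}{z^3} + \cdots \qquad q(z) = \frac{q_0}{z^2} + \frac{q_1}{z^3} + \frac{q_2}{z^4} + \cdots$$

If, moreover, we desire that the origin also be a regular singular point
we must have $p(z) = p_0/z$, $q(z) = q_0/z^2$.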


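Example 5.20. We look for the differential equation (of second order) which has
but one regular singular point at $z = 0$. In this case $z = \infty$ is an ordinary point so
that

$$\frac{2t - p(1/t)}{t^2} = p_0 + p_1t + p_2t^2 + \cdots \qquad \frac{1}{t^4}q\left(\frac{1}{t}\right) = q_0 + q_1t + q_2t^2 + \cdots$$

and

$$p(z) = \frac{2}{z} - \frac{p_0}{z^2} - \frac{p_1}{z^3} - \cdots \qquad q(z) = \frac{q_0}{z^4} + \frac{q_1}{z^5} + \cdots$$

In order that $z = 0$ be a regular singular point, we must have

$$p_0 = p_1 = \cdots = p_n = \cdots = 0 \qquad q_0 = q_1 = \cdots = q_n = \cdots = 0$$

Hence $q(z) = 0$, and $p(z) = 2/z$, so that

$$w'' + \frac{2}{z}w' = 0$$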


We consider now the following problem: We look for the differential
equation which has exactly three regular singular points at $z = 0, 1, \infty$;
all other points are to be ordinary points. Since $p(z)$ has at most simple
poles at $z = 0$, $z = 1$, we know that $z(1 - z)p(z)$ is an entire function.
Hence

$$z(1 - z)p(z) = \sum_{n=0}^{\infty} a_nz^n \tag{5.133}$$

for all $z$. Since $z = \infty$ is to be a regular singular point, condition (iii)
must be upheld. From (5.133)




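$$p\left(\frac{1}{t}\right) = \frac{t^2\sum_{n=0}^{\infty} a_n(1/t)^n}{t - 1} \qquad \text{and} \qquad \frac{1}{t}p\left(\frac{1}{t}\right) = \frac{t\sum_{n=0}^{\infty} a_n(1/t)^n}{t - 1}$$

must be analytic at $t = 0$. This is possible if and only if $a_n = 0$, $n > 1$.
Thus $z(1 - z)p(z) = a_0 + a_1z$ and

$$p(z) = \frac{c}{z} + \frac{A}{1 - z} \tag{5.134}$$

With the same type of reasoning the reader can show that

$$q(z) = \frac{B(z - \mu)(z - \nu)}{z^2(z - 1)^2} \tag{5.135}$$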


Let the roots of the indicial equation at $z = \infty$ be $\alpha = a$, $\alpha = b$.
We also impose the condition that the indicial equations at $z = 0$ and
$z = 1$ have at least one root equal to zero. Now at $z = 0$ we have
$p_0 = \lim_{z \to 0} zp(z) = c$ from (5.134). Also $q_0 = \lim_{z \to 0} z^2q(z) = \mu\nu B$. The
indicial equation at $z = 0$ is

$$\alpha(\alpha - 1) + c\alpha + \mu\nu B = 0$$

If $\alpha = 0$ is to be a root of this quadratic, we must have $\mu\nu B = 0$ so that
we pick $\mu = 0$ if we desire $q(z) \ne 0$. The other root is $\alpha = 1 - c$. Let
the reader show that, if one of the roots of the indicial equation at $z = 1$
is zero, the other is $\alpha = 1 + A$ and

$$q(z) = \frac{B}{z(z - 1)} \tag{5.136}$$

At $z = \infty$ we have

$$\frac{2t - p(1/t)}{t} = \frac{2t - ct - tA/(t - 1)}{t} = 2 - c - \frac{A}{t - 1}$$

and

$$p_0 = \lim_{t \to 0} \frac{2t - p(1/t)}{t} = 2 - c + A$$

Similarly $q_0 = B$, and the indicial equation at $z = \infty$ is

$$\alpha(\alpha - 1) + (2 - c + A)\alpha + B = 0 \qquad \text{or} \qquad \alpha^2 + (1 - c + A)\alpha + B = 0$$

If the roots of this equation are $a$ and $b$, we must have

$$a + b = -1 + c - A$$




$ab = B$, so that $-A = a + b + 1 - c$. Our differential equation is

$$\frac{d^2w}{dz^2} + \left(\frac{c}{z} + \frac{1 - c + a + b}{z - 1}\right)\frac{dw}{dz} + \frac{ab}{z(z - 1)}w = 0 \tag{5.137}$$

Equation (5.137) is the famous hypergeometric differential equation.
Its solution is written in the form (after Papperitz)

$$w(z) = P\begin{Bmatrix} 0 & \infty & 1 & \\ 0 & a & 0 & z \\ 1 - c & b & c - a - b & \end{Bmatrix} \tag{5.138}$$

The top row denotes the regular singular points, and the other rows
contain the roots of the indicial equations at each singular point. Note
that the sum of the roots is unity.

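The operator $\theta = z\,\dfrac{d}{dz}$ due to Boole is very useful in solving (5.137)
by series methods. The reader can show that $\theta$ has the following
properties,

$$\begin{aligned}
\theta(c_1\varphi_1 + c_2\varphi_2) &= c_1\theta\varphi_1 + c_2\theta\varphi_2 \\
\theta z^{\alpha} &= \alpha z^{\alpha} \\
\theta(\ln z)^k &= k(\ln z)^{k-1} \\
\theta^n(z^{\alpha}\varphi(z)) &= z^{\alpha}(\theta + \alpha)^n\varphi
\end{aligned} \tag{5.139}$$

where $\theta^2\varphi = \theta(\theta\varphi)$, etc., and $(\theta + \alpha)^n\varphi = \sum_{r=0}^{n} \binom{n}{r}\alpha^{n-r}\theta^r\varphi$.
The reader can show that (5.137) can be written in the form

$$\theta(\theta + c - 1)w = z(\theta + a)(\theta + b)w \tag{5.140}$$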
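We solve (5.137) or (5.140) by series. Let

$$w = \sum_{k=0}^{\infty} c_kz^{k+\alpha}$$

Then

$$\begin{aligned}
\theta w &= \sum_{k=0}^{\infty} c_k\theta z^{k+\alpha} = \sum_{k=0}^{\infty} c_k(k + \alpha)z^{k+\alpha} \\
(\theta + b)w &= \sum_{k=0}^{\infty} c_k(k + \alpha + b)z^{k+\alpha} \\
(\theta + a)(\theta + b)w &= \sum_{k=0}^{\infty} c_k(k + \alpha + b)(k + \alpha + a)z^{k+\alpha} \\
z(\theta + a)(\theta + b)w &= \sum_{k=0}^{\infty} c_k(k + \alpha + b)(k + \alpha + a)z^{k+\alpha+1}
\end{aligned} \tag{5.141}$$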




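Also

$$\theta(\theta + c - 1)w = \sum_{k=0}^{\infty} c_k(k + \alpha)(k + \alpha + c - 1)z^{k+\alpha} \tag{5.142}$$

Equating (5.141) and (5.142), we obtain the recursion formula

$$c_{k+1}(k + \alpha + 1)(k + \alpha + c) = c_k(k + \alpha + a)(k + \alpha + b)$$

along with the indicial equation

$$\alpha(\alpha + c - 1) = 0$$

For $\alpha = 0$ we have

$$c_{k+1} = \frac{(k + a)(k + b)}{(k + 1)(k + c)}c_k \qquad k = 0, 1, 2, \ldots$$

so that

$$c_1 = \frac{ab}{1 \cdot c}c_0 \qquad c_2 = \frac{(a + 1)(b + 1)}{2(c + 1)}c_1 = \frac{(a + 1)a(b + 1)b}{(c + 1)c}\frac{c_0}{2!}$$

and, in general,

$$c_n = \frac{(a + n - 1) \cdots (a + 1)a\,(b + n - 1) \cdots (b + 1)b}{(c + n - 1) \cdots (c + 1)c}\frac{c_0}{n!}
= \frac{c_0\Gamma(c)}{\Gamma(a)\Gamma(b)}\frac{\Gamma(a + n)\Gamma(b + n)}{\Gamma(c + n)}\frac{1}{n!}$$

where $\Gamma(a)$ is the gamma, or factorial, function of Chap. 4. Absorbing the
constant factor $c_0\Gamma(c)/\Gamma(a)\Gamma(b)$ into $A$, a formal solution of (5.137) is

$$w(z) = A\sum_{n=0}^{\infty} \frac{\Gamma(a + n)\Gamma(b + n)}{\Gamma(c + n)}\frac{z^n}{n!} \tag{5.143}$$

$a$, $b$, $c$ cannot be negative integers. Why?
We define the hypergeometric function $F(a, b; c; z)$ by the equation

$$F(a, b; c; z) = \frac{\Gamma(c)}{\Gamma(a)\Gamma(b)}\sum_{n=0}^{\infty} \frac{\Gamma(a + n)\Gamma(b + n)}{\Gamma(c + n)}\frac{z^n}{n!} \tag{5.144}$$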
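
The recursion formula for $\alpha = 0$ gives a direct way to sum the hypergeometric series numerically. The following is a rough sketch in Python using floating-point arithmetic; the function name, the fixed term count, and the sample value $z = 0.3$ are assumptions chosen for illustration. It compares partial sums against the two closed forms that appear in the problems of this section.

```python
import math

def hypergeometric_series(a, b, c, z, n_terms=200):
    """Partial sum of F(a, b; c; z) using c_{k+1} = (k+a)(k+b)/((k+1)(k+c)) c_k, c_0 = 1."""
    term = 1.0
    total = 1.0
    for k in range(n_terms):
        term *= (k + a) * (k + b) / ((k + 1) * (k + c)) * z
        total += term
    return total

z = 0.3
binomial_check = hypergeometric_series(1.5, 2.0, 2.0, z)  # compare with (1 - z)^(-1.5)
log_check = z * hypergeometric_series(1.0, 1.0, 2.0, z)   # compare with ln(1 / (1 - z))
```

The sketch makes no attempt to handle $|z| \ge 1$ or values of $c$ that are zero or negative integers, where the series is not usable as written.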
Problems 


1. Solve $zw'' + 2w' = 0$ for $w(z)$ in the neighborhood of $z = 0$.
2. Derive (5.135).
3. Let $w(z)$ be a solution of Legendre's differential equation

$$(1 - z^2)\frac{d^2w}{dz^2} - 2z\frac{dw}{dz} + n(n + 1)w = 0$$

Show that

$$w(z) = P\begin{Bmatrix} 1 & -1 & \infty & \\ 0 & 0 & n + 1 & z \\ 0 & 0 & -n & \end{Bmatrix}$$

4. Verify (5.139).
5. If $c$ is not an integer, show that a second solution of (5.137) is

$$w = z^{1-c}\sum_{n=0}^{\infty} \frac{\Gamma(n + 1 + a - c)\Gamma(n + 1 + b - c)}{\Gamma(n + 2 - c)}\frac{z^n}{n!}$$

6. What is the interval of convergence of (5.143)?
7. Show that

$$(1 - z)^{-a} = F(a, \beta; \beta; z) \qquad \ln\frac{1}{1 - z} = zF(1, 1; 2; z)$$

by expanding the left-hand sides in Taylor-series expansions.
8. Show that when $c = 1$ the second independent solution of (5.137) is

$$\Gamma(a)\Gamma(b)F(a, b; 1; z)\ln z + \sum_{r=1}^{\infty} \frac{\Gamma(a + r)\Gamma(b + r)}{(r!)^2}s_rz^r$$

where $s_r = \sum_{n=1}^{r} \left(\dfrac{1}{a + n - 1} + \dfrac{1}{b + n - 1} - \dfrac{2}{n}\right)$


5.12. The Confluence of Singularities. Laguerre Polynomials. In
the discussion of the hypergeometric function and its associated differential
equation the regular singular points $0, 1, \infty$ were distinct. It
turns out to be useful to consider what happens when two or more
regular singular points approach coincidence, a process usually referred
to as the "confluence" of singularities. The reader is referred to Chap.
XX of Ince's text on differential equations for more details and for a very
useful classification of linear second-order differential equations according
to the number, nature, and genesis of their singularities by confluence.
We consider Example 5.21 by way of illustration.

Example 5.21. Kummer's Confluent Hypergeometric Function. Consider the hypergeometric
function

$$w(z) = F(a, b; c; z) = \sum_{n=0}^{\infty} \frac{a(a + 1) \cdots (a + n - 1)\,b(b + 1) \cdots (b + n - 1)}{c(c + 1) \cdots (c + n - 1)}\frac{z^n}{n!} \tag{5.145}$$

for $|z| < 1$, which satisfies

$$z(1 - z)w'' + [c - (a + b + 1)z]w' - abw = 0 \tag{5.146}$$




Now let $z' = bz$, and then replace $z'$ by $z$. We obtain

$$F\left(a, b; c; \frac{z}{b}\right) = \sum_{n=0}^{\infty} \frac{a(a + 1) \cdots (a + n - 1)\,b(b + 1) \cdots (b + n - 1)}{c(c + 1) \cdots (c + n - 1)}\frac{z^n}{b^n\,n!} \tag{5.147}$$

which converges for $|z| < b$. $F(a, b; c; z/b)$ satisfies

$$z\left(1 - \frac{z}{b}\right)w'' + \left(c - z - \frac{a + 1}{b}z\right)w' - aw = 0 \tag{5.148}$$

If we let $b \to \infty$, we obtain (formally) the series

$${}_1F_1(a; c; z) = \sum_{n=0}^{\infty} \frac{a(a + 1) \cdots (a + n - 1)}{c(c + 1) \cdots (c + n - 1)}\frac{z^n}{n!} \tag{5.149}$$

which certainly converges for $|z| < 1$. Moreover it can be shown that ${}_1F_1(a; c; z)$
satisfies

$$zw'' + (c - z)w' - aw = 0 \tag{5.150}$$

obtained formally by letting $b \to \infty$. Equation (5.150) is Kummer's confluent hypergeometric
equation. It has a regular singular point at $z = 0$, but $z = \infty$ is an
irregular point, that is, $z = \infty$ is neither an ordinary point nor a regular singular point.
If $c$ is not an integer, the other solution of (5.150) is

$$z^{1-c}\,{}_1F_1(a - c + 1; 2 - c; z) \tag{5.151}$$

If $c$ is a negative integer or zero, (5.149) becomes meaningless, while if $c$ is a positive
integer greater than 1, (5.151) becomes meaningless. If $c = 1$, both solutions coincide.
The second solution can then be obtained by the method of Frobenius.
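
The formal limit $b \to \infty$ in Example 5.21 can be observed numerically. This is a small sketch in Python with floating-point partial sums; the parameter values $a = 0.5$, $c = 1.5$, $z = 0.8$ and the function names are assumptions made only for illustration. It evaluates $F(a, b; c; z/b)$ for increasing $b$ and measures the distance to the Kummer series (5.149).

```python
import math

def kummer_series(a, c, z, n_terms=80):
    """Partial sum of the series (5.149), built term by term."""
    term, total = 1.0, 1.0
    for k in range(n_terms):
        term *= (a + k) / ((c + k) * (k + 1)) * z
        total += term
    return total

def scaled_hypergeometric(a, b, c, z, n_terms=80):
    """Partial sum of F(a, b; c; z/b), the series (5.147)."""
    term, total = 1.0, 1.0
    for k in range(n_terms):
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * (z / b)
        total += term
    return total

a, c, z = 0.5, 1.5, 0.8
limit_value = kummer_series(a, c, z)
gaps = [abs(scaled_hypergeometric(a, b, c, z) - limit_value)
        for b in (10.0, 100.0, 1000.0, 10000.0)]
```

The entries of `gaps` shrink as $b$ grows. As a sanity check on the series itself, `kummer_series(1.0, 1.0, z)` reduces to the exponential series for $e^z$.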


We now consider the function ${}_1F_1(a; c + 1; z)$. The series representation
of ${}_1F_1(a; c + 1; z)$ will terminate if $a$ is a negative integer, $-n$, for
the coefficient of $z^{n+1}$ is

$$\frac{(-n)(-n + 1) \cdots (-n + n + 1 - 1)}{(c + 1)(c + 2) \cdots (c + n + 1)\,(n + 1)!} = 0$$

Similarly higher powers of $z$ have zero coefficients. Hence ${}_1F_1(-n; c + 1; z)$
is a polynomial of degree $n$ if $n$ is a positive integer. We define

$$L_n^{(c)}(z) = \frac{(c + 1)(c + 2) \cdots (c + n)}{n!}\,{}_1F_1(-n; c + 1; z) \tag{5.152}$$

as the associated Laguerre polynomial, $n$ a positive integer. The reader
can show that

$$z\frac{d^2L_n^{(c)}(z)}{dz^2} + (c + 1 - z)\frac{dL_n^{(c)}(z)}{dz} + nL_n^{(c)}(z) = 0 \tag{5.153}$$




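We show now that we can write

$$L_n^{(c)}(z) = \frac{e^zz^{-c}}{n!}\frac{d^n}{dz^n}(e^{-z}z^{n+c}) \tag{5.154}$$

Let $v(z) = e^{-z}z^{n+c}$ so that $\dfrac{dv}{dz} = e^{-z}[(n + c)z^{n+c-1} - z^{n+c}]$ and

$$z\frac{dv}{dz} = (n + c - z)v$$

Differentiating $n + 1$ times with respect to $z$ yields

$$z\frac{d^{n+2}v}{dz^{n+2}} + (n + 1)\frac{d^{n+1}v}{dz^{n+1}} = (n + c - z)\frac{d^{n+1}v}{dz^{n+1}} - (n + 1)\frac{d^nv}{dz^n}$$

With $u = \dfrac{d^nv}{dz^n}$ this reads

$$z\frac{d^2u}{dz^2} + (z + 1 - c)\frac{du}{dz} + (n + 1)u = 0 \tag{5.155}$$

Now let $w(z) = e^zz^{-c}u(z) = e^zz^{-c}\dfrac{d^n}{dz^n}(e^{-z}z^{n+c})$. If $n$ is a positive integer,
it is easy to see that $w(z)$ is a polynomial of degree $n$. From

$$u(z) = w(z)e^{-z}z^c$$

and the fact that $u(z)$ satisfies (5.155) let the reader show that $w(z)$
satisfies

$$z\frac{d^2w}{dz^2} + (c + 1 - z)\frac{dw}{dz} + nw = 0 \tag{5.156}$$

Since (5.156) is exactly the same as (5.153), we must have

$$L_n^{(c)}(z) = Kw(z) = Ke^zz^{-c}\frac{d^n}{dz^n}(e^{-z}z^{n+c})$$

Let $z = 0$. From (5.152) the reader can easily show that

$$L_n^{(c)}(0) = \frac{(c + 1)(c + 2) \cdots (c + n)}{n!}$$

Let the reader show also that

$$\lim_{z \to 0} e^zz^{-c}\frac{d^n}{dz^n}(e^{-z}z^{n+c}) = (c + 1)(c + 2) \cdots (c + n)$$

so that $K = \dfrac{1}{n!}$. This proves (5.154).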
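
The definition (5.152) is easy to turn into explicit coefficients, which can then be tested against the differential equation (5.153). The following is a sketch in Python with exact rationals; `laguerre_coefficients` and `ode_residual` are hypothetical helper names, and the sample values $n = 4$, $c = 1/2$ are arbitrary. Coefficient lists are stored lowest power first.

```python
from fractions import Fraction
from math import factorial

def laguerre_coefficients(n, c):
    """Coefficients (lowest power first) of L_n^{(c)}(z) built from (5.152)."""
    c = Fraction(c)
    prefactor = Fraction(1, factorial(n))
    for j in range(1, n + 1):
        prefactor *= c + j
    coeffs, term = [], Fraction(1)
    for k in range(n + 1):
        coeffs.append(prefactor * term)
        # Ratio of successive coefficients of 1F1(-n; c+1; z).
        term *= Fraction(-n + k) / ((c + 1 + k) * (k + 1))
    return coeffs

def ode_residual(coeffs, n, c):
    """Coefficient of z^k in z L'' + (c + 1 - z) L' + n L, for each k."""
    c = Fraction(c)
    padded = coeffs + [Fraction(0)]
    return [(k + 1) * (k + c + 1) * padded[k + 1] + (n - k) * padded[k]
            for k in range(len(coeffs))]

sample = laguerre_coefficients(4, Fraction(1, 2))
residual = ode_residual(sample, 4, Fraction(1, 2))
ordinary_l2 = laguerre_coefficients(2, 0)  # ordinary Laguerre polynomial of degree 2
```

For $c = 0$, $n = 2$ this gives $1 - 2z + z^2/2$, and `residual` is identically zero for the sample polynomial.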




The associated Laguerre polynomials have the interesting and important
property

$$\int_0^{\infty} e^{-x}x^cL_m^{(c)}(x)L_n^{(c)}(x)\,dx = 0 \qquad m \ne n \tag{5.157}$$

To prove (5.157), we need only show that

$$\int_0^{\infty} e^{-x}x^{c+m}L_n^{(c)}(x)\,dx = 0 \qquad \text{for } m < n \tag{5.158}$$

If (5.158) holds, then (5.157) is valid, for $L_m^{(c)}(x)$ is a polynomial of degree
$m < n$ and (5.157) is simply a linear sum of integrals of the type (5.158).
Equation (5.158) can be obtained by integrating by parts $m$ times after
replacing $L_n^{(c)}(x)$ in (5.158) by (5.154).
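
For a nonnegative integer $c$ the orthogonality relation (5.157) can be confirmed exactly, because $\int_0^{\infty} e^{-x}x^k\,dx = k!$ for integer $k \ge 0$. The sketch below is written in Python under that restriction; the helper names and the choice $c = 2$ are assumptions for illustration, and the diagonal value $(c + n)!/n!$ is the standard normalization rather than something derived in the text above.

```python
from fractions import Fraction
from math import factorial

def laguerre_coefficients(n, c):
    """Coefficients (lowest power first) of L_n^{(c)}(z) from (5.152)."""
    prefactor = Fraction(1, factorial(n))
    for j in range(1, n + 1):
        prefactor *= c + j
    coeffs, term = [], Fraction(1)
    for k in range(n + 1):
        coeffs.append(prefactor * term)
        term *= Fraction(-n + k, (c + 1 + k) * (k + 1))
    return coeffs

def weighted_inner_product(p, q, c):
    """Exact integral of e^{-x} x^c p(x) q(x) over (0, inf) for integer c >= 0."""
    return sum(pi * qj * factorial(c + i + j)
               for i, pi in enumerate(p) for j, qj in enumerate(q))

c = 2
polys = [laguerre_coefficients(n, c) for n in range(5)]
gram = [[weighted_inner_product(p, q, c) for q in polys] for p in polys]
```

The off-diagonal entries of `gram` are exact zeros, and the diagonal entries equal $(c + n)!/n!$.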
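Problems

1. Show that ${}_1F_1(a; c; z)$ satisfies (5.150).

2. Show that $L_n^{(c)}(z)$ satisfies (5.153).

3. Show that $L_n^{(c)}(0) = \dfrac{(c + 1)(c + 2) \cdots (c + n)}{n!}$.

4. Prove (5.158).

5. Show that

$$\int_0^{\infty} e^{-x}x^cL_n^{(c)}(x)L_n^{(c)}(x)\,dx = \frac{\Gamma(c + n + 1)}{\Gamma(n + 1)}$$

Hint: Evaluate $\int_0^{\infty} e^{-x}x^{c+n}L_n^{(c)}(x)\,dx$.

6. Show that

$$nL_n^{(c)}(z) = (2n + c - 1 - z)L_{n-1}^{(c)}(z) - (n + c - 1)L_{n-2}^{(c)}(z)$$

$$z\frac{dL_n^{(c)}(z)}{dz} = nL_n^{(c)}(z) - (n + c)L_{n-1}^{(c)}(z) \qquad n \ge 2$$

$$\int_0^z L_n^{(0)}(x)\,dx = L_n^{(0)}(z) - L_{n+1}^{(0)}(z)$$

$L_n^{(0)}$ is called the ordinary Laguerre polynomial.

7. Show that $L_n^{(0)}(z) = \sum_{r=0}^{n} \binom{n}{r}\dfrac{(-z)^r}{r!}$.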


5.13. Laplace's Equation. We discuss now a method for solving
Laplace's equation. This equation is of some importance in mathematical
physics. We consider $\nabla^2V = 0$ in rectangular coordinates,

$$\frac{\partial^2V}{\partial x^2} + \frac{\partial^2V}{\partial y^2} + \frac{\partial^2V}{\partial z^2} = 0 \tag{5.159}$$

Let us look for a solution in the form

$$V(x, y, z) = X(x)Y(y)Z(z) \tag{5.160}$$




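This attempt at finding a solution of (5.159) is called the method of
separation of variables. Applying (5.160) to (5.159) yields

$$YZ\frac{d^2X}{dx^2} + XZ\frac{d^2Y}{dy^2} + XY\frac{d^2Z}{dz^2} = 0$$

For $V \ne 0$ we have upon division by $V$

$$\frac{1}{X}\frac{d^2X}{dx^2} + \frac{1}{Y}\frac{d^2Y}{dy^2} + \frac{1}{Z}\frac{d^2Z}{dz^2} = 0$$

Thus

$$-\frac{1}{X}\frac{d^2X}{dx^2} = \frac{1}{Y}\frac{d^2Y}{dy^2} + \frac{1}{Z}\frac{d^2Z}{dz^2} \tag{5.161}$$

The equality in (5.161) cannot hold unless both sides of (5.161) are constant,
for if $u(x) = v(y, z)$, then $\dfrac{du}{dx} = 0$ and $u$ = constant. Hence

$$-\frac{1}{X}\frac{d^2X}{dx^2} = \text{constant} = \pm k^2 \qquad \frac{d^2X}{dx^2} \pm k^2X = 0 \tag{5.162}$$

The solutions of (5.162) are

$$\begin{aligned}
X &= A_k\cos kx + B_k\sin kx \\
\text{or} \quad X &= A_ke^{kx} + B_ke^{-kx} \\
\text{or} \quad X &= Ax + B \quad \text{if } k = 0
\end{aligned} \tag{5.163}$$

Similarly

$$\frac{d^2Y}{dy^2} \pm l^2Y = 0$$

and

$$\begin{aligned}
Y &= C_l\cos ly + D_l\sin ly \\
\text{or} \quad Y &= C_le^{ly} + D_le^{-ly} \\
\text{or} \quad Y &= Cy + D \quad \text{if } l = 0
\end{aligned} \tag{5.164}$$

Finally

$$\frac{1}{Z}\frac{d^2Z}{dz^2} = \pm k^2 \pm l^2 \qquad \text{or} \qquad \frac{d^2Z}{dz^2} + (\mp k^2 \mp l^2)Z = 0 \tag{5.165}$$

The solution of (5.165) depends on the magnitudes of $k^2$ and $l^2$ and the
signs preceding $k^2$ and $l^2$.

The choice of the sign preceding the constants depends on the nature
of the physical problem. We illustrate with Example 5.22.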


Example 5.22. For steady-state heat flow we have $\nabla^2T = 0$. We consider a two-dimensional
semi-infinite slab of width $a$. The edge given by $x = 0$, $0 \le y < \infty$ is
kept at constant temperature $T = 0$, as is the edge given by $x = a$, $0 \le y < \infty$.




The base of the slab given by $y = 0$, $0 \le x \le a$ is kept at a steady temperature given
by $T = f(x)$. For the steady-state case we have

$$\frac{\partial^2T}{\partial x^2} + \frac{\partial^2T}{\partial y^2} = 0$$

If we let $T = X(x)Y(y)$, we obtain as above

$$-\frac{1}{X}\frac{d^2X}{dx^2} = \frac{1}{Y}\frac{d^2Y}{dy^2} = \text{constant}$$

If we choose our constant to be negative, we have $X = A_ke^{kx} + B_ke^{-kx}$. Our boundary
condition necessitates, however, that $T = 0$ when $x = 0$ and when $x = a$ for all $y$.
This cannot be achieved for $X = A_ke^{kx} + B_ke^{-kx}$ unless $A_k = B_k = 0$. However, if
we choose

$$-\frac{1}{X}\frac{d^2X}{dx^2} = k^2$$

then $X = A_k\cos kx + B_k\sin kx$. Since $\sin(n\pi x/a)$ vanishes for $x = 0$ and $x = a$
provided $n$ is an integer, it seems proper to consider

$$X = B_n\sin\frac{n\pi x}{a}$$

where $k = n\pi/a$. It follows that

$$\frac{d^2Y}{dy^2} = \frac{n^2\pi^2}{a^2}Y$$

so that $Y = C_ne^{(n\pi/a)y} + D_ne^{-(n\pi/a)y}$. We do not expect the temperature to become
infinite as $y$ becomes infinite; so we choose $C_n = 0$. Hence

$$T(x, y) = B_nD_ne^{-(n\pi/a)y}\sin\frac{n\pi x}{a} = A_ne^{-(n\pi/a)y}\sin\frac{n\pi x}{a} \tag{5.166}$$

where $n$ is an integer. The final boundary condition is $T(x, 0) = f(x)$. Equation
(5.166) cannot, in general, satisfy this boundary condition, unless, by choice, we pick
$f(x) = A\sin(n\pi x/a)$. Now $\nabla^2T = 0$ is a linear partial differential equation, and it is
an easy thing to show that

$$T(x, y) = \sum_{n=0}^{\infty} A_ne^{-(n\pi/a)y}\sin\frac{n\pi x}{a} \tag{5.167}$$

is also a solution of $\nabla^2T = 0$ provided the series converges and can be differentiated
twice term by term. To satisfy the boundary condition $T(x, 0) = f(x)$, we need

$$f(x) = \sum_{n=0}^{\infty} A_n\sin\frac{n\pi x}{a} \tag{5.168}$$

Can the infinite set of constants $A_n$, $n = 0, 1, 2, \ldots$, be found so that (5.168) holds?
This is the subject of Fourier series and will be discussed in Chap. 6.
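
When $f(x)$ is already a finite sine sum, the constants $A_n$ of (5.168) can be read off and the solution (5.167) evaluated directly. The following Python sketch assumes a hypothetical base temperature $f(x) = \sin(\pi x/a) + 0.5\sin(3\pi x/a)$ on a slab of width $a = 2$; these values, the function names, and the finite-difference step are illustrative choices, not data from the text. It checks the edge conditions and estimates $\nabla^2T$ numerically.

```python
import math

def temperature(x, y, a, amplitudes):
    """Finite version of (5.167); amplitudes[n] multiplies sin(n pi x / a)."""
    return sum(A * math.exp(-n * math.pi * y / a) * math.sin(n * math.pi * x / a)
               for n, A in amplitudes.items())

a = 2.0
amplitudes = {1: 1.0, 3: 0.5}  # hypothetical f(x) = sin(pi x/a) + 0.5 sin(3 pi x/a)

def laplacian(x, y, h=1e-3):
    """Five-point finite-difference estimate of T_xx + T_yy."""
    T = lambda u, v: temperature(u, v, a, amplitudes)
    return (T(x + h, y) + T(x - h, y) + T(x, y + h) + T(x, y - h) - 4 * T(x, y)) / h**2

edge_left = temperature(0.0, 0.7, a, amplitudes)
edge_right = temperature(a, 0.7, a, amplitudes)
interior_laplacian = laplacian(0.6, 0.4)
```

The two edge values vanish to rounding error, and the estimated Laplacian at the interior point is small compared with the size of $T$ there, which is consistent with (5.167) being a solution.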




Problem 1. Find a solution of $\nabla^2V = 0$ such that $V = 0$ for $x = 0$, $x = a$, $V = 0$
for $y = 0$, $y = b$, $V = 0$ for $z = +\infty$.
Problem 2. By the method of separation of variables find a solution of 


av , av au 


ax? | ay? = hoy 
Assume V(z, y, t) = X ee 
a2V 1 a?V 
Problem 3. Do the same for = — rs Oy Saas 


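We find now a solution of $\nabla^2V = 0$ in spherical coordinates by use of
the method of separation of variables. In spherical coordinates Laplace's
equation is

$$\sin\theta\,\frac{\partial}{\partial r}\left(r^2\frac{\partial V}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial V}{\partial\theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2V}{\partial\varphi^2} = 0$$

Assuming $V(r, \theta, \varphi) = R(r)\Theta(\theta)\Phi(\varphi)$, we obtain upon division by $V$ and
multiplication by $\sin\theta$

$$\frac{\sin^2\theta}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) = -\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} \tag{5.169}$$

Let the reader show that of necessity

$$-\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} = \text{constant}$$

Now physically we expect and desire that

$$V(r, \theta, \varphi) = V(r, \theta, \varphi + 2\pi)$$

Let the reader show that this implies that the constant be chosen as the
square of an integer, that is,

$$\frac{d^2\Phi}{d\varphi^2} + n^2\Phi = 0$$

Thus $\Phi = A_n\cos n\varphi + B_n\sin n\varphi$. Equation (5.169) now becomes

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) = -\frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + \frac{n^2}{\sin^2\theta} \tag{5.170}$$

and this implies that

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) = \text{constant} = k \qquad \text{or} \qquad r^2\frac{d^2R}{dr^2} + 2r\frac{dR}{dr} - kR = 0 \tag{5.171}$$

To solve (5.171), we assume a solution in the form $R(r) = r^m$. This
yields $r^m[m(m - 1) + 2m - k] = 0$. Hence $r^m$ is a solution of (5.171)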




provided $m(m + 1) = k$. Similarly $r^{-m-1}$ is a solution of (5.171) for
$k = m(m + 1)$. The most general solution of (5.171) for $k = m(m + 1)$
is

$$R(r) = C_mr^m + D_mr^{-m-1} \tag{5.172}$$

We need not have guessed at a solution of (5.171), for it is immediately
obvious that $r = 0$ is a regular singular point of (5.171). The series
method of Sec. 5.10 could be used to obtain (5.172). Equation (5.170)
now becomes

$$\sin\theta\,\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + [m(m + 1)\sin^2\theta - n^2]\Theta = 0 \tag{5.173}$$

If we let $\mu = \cos\theta$, $d\mu = -\sin\theta\,d\theta$, (5.173) can be written

$$(1 - \mu^2)\frac{d}{d\mu}\left[(1 - \mu^2)\frac{d\Theta}{d\mu}\right] + [m(m + 1)(1 - \mu^2) - n^2]\Theta = 0$$

or

$$(1 - \mu^2)\frac{d^2\Theta}{d\mu^2} - 2\mu\frac{d\Theta}{d\mu} + \left[m(m + 1) - \frac{n^2}{1 - \mu^2}\right]\Theta = 0 \tag{5.174}$$

Equation (5.174) is called the associated Legendre differential equation.
Its solution in Papperitz's notation is

$$\Theta(\mu) = P\begin{Bmatrix} -1 & \infty & 1 & \\ \tfrac{1}{2}n & m + 1 & \tfrac{1}{2}n & \mu \\ -\tfrac{1}{2}n & -m & -\tfrac{1}{2}n & \end{Bmatrix} \tag{5.175}$$

Problem 4. Deduce (5.175).

Problem 5. If $V$ is independent of $\varphi$, that is, $n = 0$, (5.174) becomes

$$(1 - \mu^2)\frac{d^2P}{d\mu^2} - 2\mu\frac{dP}{d\mu} + m(m + 1)P = 0 \tag{5.176}$$

with $\Theta$ replaced by $P$. For $m$ an integer show that the solution of (5.176) in the
neighborhood of $\mu = 0$, written $P_m(\mu)$, is a polynomial of degree $m$. Also show that

$$\int_{-1}^{1} P_m(\mu)P_n(\mu)\,d\mu = 0 \qquad \text{for } m \ne n$$

The $P_m(\mu)$, $m = 0, 1, 2, \ldots$, are called Legendre polynomials. We shall have a
great deal more to say about such polynomials in Chap. 6. A solution of $\nabla^2V = 0$ for
$V = V(r, \theta)$ is

$$V(r, \theta) = \sum_{m=0}^{\infty} (A_mr^m + B_mr^{-(m+1)})P_m(\cos\theta) \tag{5.177}$$
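
The polynomial solutions asked for in Problem 5 can be generated from the series recursion that (5.176) implies. Substituting $P = \sum c_k\mu^k$ gives $c_{k+2} = \dfrac{k(k + 1) - m(m + 1)}{(k + 1)(k + 2)}c_k$, which terminates at $k = m$. The sketch below uses Python with exact rationals; the normalization $P_m(1) = 1$ and the helper names are assumptions (the conventional choice, not fixed by the text above). It also checks orthogonality on $(-1, 1)$ using $\int_{-1}^{1} \mu^j\,d\mu = 2/(j + 1)$ for even $j$.

```python
from fractions import Fraction

def legendre_coefficients(m):
    """Coefficients (lowest power first) of the degree-m polynomial solution of (5.176)."""
    coeffs = [Fraction(0)] * (m + 1)
    coeffs[m % 2] = Fraction(1)
    for k in range(m % 2, m - 1, 2):
        coeffs[k + 2] = coeffs[k] * Fraction(k * (k + 1) - m * (m + 1), (k + 1) * (k + 2))
    value_at_one = sum(coeffs)  # rescale so that the polynomial equals 1 at mu = 1
    return [ck / value_at_one for ck in coeffs]

def inner_product(p, q):
    """Exact integral of p(mu) q(mu) over (-1, 1); odd powers integrate to zero."""
    return sum(pi * qj * Fraction(2, i + j + 1)
               for i, pi in enumerate(p) for j, qj in enumerate(q) if (i + j) % 2 == 0)

polys = [legendre_coefficients(m) for m in range(6)]
gram = [[inner_product(p, q) for q in polys] for p in polys]
```

For $m = 2$ the coefficients are those of $(3\mu^2 - 1)/2$, and the off-diagonal entries of `gram` are exact zeros.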


We have seen that the solution of Laplace's equation in spherical
coordinates led to Legendre polynomials. The solution of Laplace's
equation in cylindrical coordinates yields the Bessel function. In cylindrical
coordinates $\nabla^2V = 0$ becomes




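$$\frac{\partial}{\partial r}\left(r\frac{\partial V}{\partial r}\right) + \frac{1}{r}\frac{\partial^2V}{\partial\theta^2} + r\frac{\partial^2V}{\partial z^2} = 0 \tag{5.178}$$

Assuming $V(r, \theta, z) = R(r)\Theta(\theta)Z(z)$ yields

$$\frac{r}{R}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{r^2}{Z}\frac{d^2Z}{dz^2} = -\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} \tag{5.179}$$

If we desire $V(r, \theta, z) = V(r, \theta + 2\pi, z)$, we need

$$-\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} = \nu^2 \qquad \nu \text{ an integer}$$

so that

$$\Theta(\theta) = A_{\nu}\cos\nu\theta + B_{\nu}\sin\nu\theta \tag{5.180}$$

Similarly

$$\frac{1}{Z}\frac{d^2Z}{dz^2} = k^2 \qquad k \text{ real or a pure imaginary}$$

so that

$$Z(z) = C_ke^{kz} + D_ke^{-kz} \tag{5.181}$$

If $k$ is real, $Z(z)$ is exponential, and if $k$ is a pure imaginary, $Z(z)$ is
trigonometric. Equation (5.179) now becomes

$$\frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr} + \left(k^2 - \frac{\nu^2}{r^2}\right)R = 0 \tag{5.182}$$

If we let $z = kr$, $R = w$, (5.182) becomes

$$\frac{d^2w}{dz^2} + \frac{1}{z}\frac{dw}{dz} + \left(1 - \frac{\nu^2}{z^2}\right)w = 0 \tag{5.183}$$

This is Bessel's differential equation. We see that $z = 0$ is a regular
singular point.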


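Problem 6. Solve (5.182) for $k = 0$ and $\nu$ an integer.
Problem 7. Solve (5.183) for the cases $\nu$ = integer, $\nu \ne$ integer. Hint: Let
$z^2 = 4\tau$, $\theta = \tau\,\dfrac{d}{d\tau}$, and write (5.183) as

$$(\theta^2 - \tfrac{1}{4}\nu^2)w + \tau w = 0$$

Let $w = \tau^{\alpha}\sum_{r=0}^{\infty} c_r\tau^r$, and show that

$$w_1 = A\tau^{\nu/2}\sum_{r=0}^{\infty} \frac{(-\tau)^r}{r!\,\Gamma(\nu + r + 1)}$$

If $\nu$ is not an integer, a second solution is

$$w_2 = B\tau^{-\nu/2}\sum_{r=0}^{\infty} \frac{(-\tau)^r}{r!\,\Gamma(-\nu + r + 1)}$$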


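We define the Bessel function of the first kind of order $\nu$ to be

$$J_{\nu}(z) = \left(\frac{z}{2}\right)^{\nu}\sum_{r=0}^{\infty} \frac{(-1)^r(z/2)^{2r}}{r!\,\Gamma(\nu + r + 1)} \tag{5.184}$$

An important recurrence formula for the $J_{\nu}(z)$ is

$$J_{\nu-1}(z) + J_{\nu+1}(z) = \frac{2\nu}{z}J_{\nu}(z) \tag{5.185}$$

To prove (5.185), we note that

$$\begin{aligned}
J_{\nu-1}(z) + J_{\nu+1}(z) &= \left(\frac{z}{2}\right)^{\nu-1}\sum_{r=0}^{\infty} \frac{(-1)^r(z/2)^{2r}}{r!\,\Gamma(\nu + r)} + \left(\frac{z}{2}\right)^{\nu+1}\sum_{r=0}^{\infty} \frac{(-1)^r(z/2)^{2r}}{r!\,\Gamma(\nu + r + 2)} \\
&= \left(\frac{z}{2}\right)^{\nu-1}\left[\sum_{r=0}^{\infty} \frac{(-z^2/4)^r}{r!\,\Gamma(\nu + r)} - \sum_{r=0}^{\infty} \frac{(-z^2/4)^{r+1}}{r!\,\Gamma(\nu + r + 2)}\right] \\
&= \left(\frac{z}{2}\right)^{\nu-1}\left\{\frac{1}{\Gamma(\nu)} + \sum_{r=1}^{\infty} \left[\frac{1}{r!\,\Gamma(\nu + r)} - \frac{1}{(r - 1)!\,\Gamma(\nu + r + 1)}\right]\left(-\frac{z^2}{4}\right)^r\right\} \\
&= \left(\frac{z}{2}\right)^{\nu-1}\left[\frac{1}{\Gamma(\nu)} + \sum_{r=1}^{\infty} \frac{\nu(-z^2/4)^r}{r!\,\Gamma(\nu + r + 1)}\right]
\end{aligned}$$

However,

$$\frac{2\nu}{z}J_{\nu}(z) = \left(\frac{z}{2}\right)^{\nu-1}\left[\frac{\nu}{\Gamma(\nu + 1)} + \sum_{r=1}^{\infty} \frac{\nu(-z^2/4)^r}{r!\,\Gamma(\nu + r + 1)}\right]$$

Equation (5.185) is seen to hold since $\nu/\Gamma(\nu + 1) = 1/\Gamma(\nu)$.
Problem 8. Show that

$$J_{\nu-1}(z) - J_{\nu+1}(z) = 2J_{\nu}'(z) \tag{5.186}$$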
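
The recurrence formula (5.185) can also be spot-checked numerically from the defining series (5.184). This is a minimal sketch in Python using `math.gamma`; the function name, the number of terms, and the sample values $\nu = 1.5$, $z = 2$ are assumptions for illustration, and the series is used only where $\nu + r + 1$ avoids the poles of the gamma function. The last line compares the series with the closed form for order one-half that Problem 10 of this section asks the reader to establish.

```python
import math

def bessel_j(nu, z, n_terms=40):
    """Partial sum of the series (5.184) for J_nu(z)."""
    return sum((-1) ** r * (z / 2) ** (2 * r + nu)
               / (math.factorial(r) * math.gamma(nu + r + 1))
               for r in range(n_terms))

nu, z = 1.5, 2.0
lhs = bessel_j(nu - 1, z) + bessel_j(nu + 1, z)
rhs = (2 * nu / z) * bessel_j(nu, z)
half_order_gap = abs(bessel_j(0.5, z) - math.sqrt(2 / (math.pi * z)) * math.sin(z))
```

Both `lhs - rhs` and `half_order_gap` are at the level of floating-point rounding for these inputs.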


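Problem 9. From (5.185) and (5.186) show that

$$\frac{\nu}{z}J_{\nu}(z) + J_{\nu}'(z) = J_{\nu-1}(z) \qquad \frac{\nu}{z}J_{\nu}(z) - J_{\nu}'(z) = J_{\nu+1}(z)$$

Problem 10. Show that

$$J_{1/2}(z) = \left(\frac{2}{\pi z}\right)^{1/2}\sin z$$

Problem 11. Show that, if $\alpha \ne \beta$, $\nu > -1$,

$$(\alpha^2 - \beta^2)\int_0^x tJ_{\nu}(\alpha t)J_{\nu}(\beta t)\,dt = x\left[J_{\nu}(\alpha x)\frac{dJ_{\nu}(\beta x)}{dx} - J_{\nu}(\beta x)\frac{dJ_{\nu}(\alpha x)}{dx}\right]$$

$$2\alpha^2\int_0^x t[J_{\nu}(\alpha t)]^2\,dt = (\alpha^2x^2 - \nu^2)[J_{\nu}(\alpha x)]^2 + \left[x\frac{dJ_{\nu}(\alpha x)}{dx}\right]^2$$

Problem 12. From the results of Prob. 11 show that

$$\int_0^1 tJ_{\nu}(\alpha t)J_{\nu}(\beta t)\,dt = 0$$

provided $J_{\nu}(\alpha) = J_{\nu}(\beta) = 0$, $\alpha \ne \beta$, $\nu > -1$. Also show that for this case

$$\int_0^1 t[J_{\nu}(\alpha t)]^2\,dt = \tfrac{1}{2}[J_{\nu+1}(\alpha)]^2$$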


5.14. Hermite Polynomials. The differential equation

$$\frac{d^2w}{dt^2} - t\frac{dw}{dt} + nw = 0 \tag{5.187}$$

arises in quantum mechanics in connection with the one-dimensional
harmonic oscillator. We shall see that the Hermite polynomials defined
by

$$H_n(t) = (-1)^ne^{t^2/2}\frac{d^ne^{-t^2/2}}{dt^n} \tag{5.188}$$

satisfy (5.187) for any nonnegative integer $n$. The reader can easily
verify that $H_n(t)$ is a polynomial of degree $n$. From (5.188) we have

$$\begin{aligned}
\frac{dH_n(t)}{dt} &= (-1)^nte^{t^2/2}\frac{d^ne^{-t^2/2}}{dt^n} + (-1)^ne^{t^2/2}\frac{d^n}{dt^n}(-te^{-t^2/2}) \\
&= (-1)^ne^{t^2/2}\left(t\frac{d^ne^{-t^2/2}}{dt^n} - t\frac{d^ne^{-t^2/2}}{dt^n} - n\frac{d^{n-1}e^{-t^2/2}}{dt^{n-1}}\right) \\
&= (-1)^{n-1}ne^{t^2/2}\frac{d^{n-1}e^{-t^2/2}}{dt^{n-1}}
\end{aligned}$$

so that

$$\frac{dH_n(t)}{dt} = nH_{n-1}(t) \tag{5.189}$$

Problem 1. Show that

$$H_{n+2}(t) = tH_{n+1}(t) - (n + 1)H_n(t) \tag{5.190}$$




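Problem 2. From (5.189), (5.190) show that $\dfrac{d^2H_n}{dt^2} - t\dfrac{dH_n}{dt} + nH_n = 0$.

From (5.189) we have

$$\frac{d^2H_n(t)}{dt^2} = n\frac{dH_{n-1}(t)}{dt} = n(n - 1)H_{n-2}(t)$$

and, in general,

$$\frac{d^rH_n(t)}{dt^r} = n(n - 1) \cdots (n - r + 1)H_{n-r}(t)$$

and

$$\frac{1}{r!}\frac{d^rH_n(t)}{dt^r} = \binom{n}{r}H_{n-r}(t) \tag{5.191}$$

From Taylor's expansion theorem and using (5.191) we obtain

$$H_n(t + s) = \sum_{r=0}^{n} \frac{1}{r!}\frac{d^rH_n(t)}{dt^r}s^r = \sum_{r=0}^{n} \binom{n}{r}H_{n-r}(t)s^r \tag{5.192}$$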











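We now look for a generating function $\varphi(z, t)$ in the sense that

$$\varphi(z, t) = \sum_{n=0}^{\infty} H_n(t)\frac{z^n}{n!} \tag{5.193}$$

If $\varphi(z, t)$ can be found, then the Taylor-series expansion of $\varphi(z, t)$ in
powers of $z$ will yield the Hermite polynomials. If $\varphi(z, t)$ does exist,
then formally we have

$$\frac{\partial\varphi}{\partial t} = \sum_{n=0}^{\infty} H_n'(t)\frac{z^n}{n!}$$

assuming that one can differentiate term by term. From (5.189) we have

$$\frac{\partial\varphi}{\partial t} = \sum_{n=1}^{\infty} nH_{n-1}(t)\frac{z^n}{n!} = z\sum_{n=1}^{\infty} H_{n-1}(t)\frac{z^{n-1}}{(n - 1)!} = z\varphi(z, t)$$

Integration yields

$$\varphi(z, t) = e^{zt+u(z)}$$

where $u(z)$ is an arbitrary function of integration. Thus

$$\frac{\partial^2\varphi}{\partial z^2} = e^{zt+u(z)}\{u''(z) + [t + u'(z)]^2\} \tag{5.194}$$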




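From (5.193) we have formally

$$\frac{\partial\varphi}{\partial z} = \sum_{n=1}^{\infty} H_n(t)\frac{z^{n-1}}{(n - 1)!} = \sum_{n=0}^{\infty} H_{n+1}(t)\frac{z^n}{n!} \qquad \frac{\partial^2\varphi}{\partial z^2} = \sum_{n=0}^{\infty} H_{n+2}(t)\frac{z^n}{n!}$$

so that

$$\frac{\partial^2\varphi}{\partial z^2} - t\frac{\partial\varphi}{\partial z} = \sum_{n=0}^{\infty} [H_{n+2}(t) - tH_{n+1}(t)]\frac{z^n}{n!} = -\sum_{n=0}^{\infty} (n + 1)H_n(t)\frac{z^n}{n!}$$

Making use of (5.194) yields

$$u''(z) + [u'(z)]^2 + (z + t)u'(z) + zt = -1 \tag{5.195}$$

Since $z$ and $t$ are independent, we find that $u(z)$ must satisfy $u' + z = 0$.
Now $u(z) = -z^2/2$ satisfies (5.195). The function

$$\varphi(z, t) = e^{zt-z^2/2} \tag{5.196}$$

can be shown to be the generating function for the $H_n(t)$. There is no
trouble in differentiating the series expansion of $\varphi(z, t)$ term by term
since $\varphi(z, t)$ is analytic for all $z$, $t$ in the complex domain.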
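
The generating function (5.196) lends itself to a quick numerical test. The sketch below, in Python with floating-point arithmetic, produces $H_n(t)$ from the three-term relation (5.190) starting at $H_0 = 1$, $H_1 = t$, and compares the partial sum of (5.193) with $e^{zt - z^2/2}$. The function name and the sample point $t = 0.7$, $z = 0.4$ are assumptions chosen for illustration.

```python
import math

def hermite_values(t, n_max):
    """H_0(t), ..., H_{n_max}(t) from H_{n+1} = t H_n - n H_{n-1}, i.e. (5.190) shifted by one."""
    values = [1.0, t]
    for n in range(1, n_max):
        values.append(t * values[n] - n * values[n - 1])
    return values[: n_max + 1]

t, z = 0.7, 0.4
series_sum = sum(h * z**n / math.factorial(n)
                 for n, h in enumerate(hermite_values(t, 40)))
closed_form = math.exp(z * t - z * z / 2)
```

The two values agree to rounding error. As a spot check on the recursion, `hermite_values(2.0, 3)[3]` returns $t^3 - 3t$ evaluated at $t = 2$.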

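Finally, we show that

$$\int_{-\infty}^{\infty} e^{-t^2/2}H_m(t)H_n(t)\,dt = \begin{cases} 0 & \text{if } m \ne n \\ \sqrt{2\pi}\,n! & \text{if } m = n \end{cases} \tag{5.197}$$

We consider

$$I = \int_{-\infty}^{\infty} e^{-t^2/2}H_m(t)H_n(t)\,dt = (-1)^m\int_{-\infty}^{\infty} H_n(t)\frac{d^me^{-t^2/2}}{dt^m}\,dt \qquad m \ge n$$

Integration by parts yields

$$\begin{aligned}
I &= (-1)^m\left[H_n(t)\frac{d^{m-1}e^{-t^2/2}}{dt^{m-1}}\right]_{-\infty}^{\infty} - (-1)^m\int_{-\infty}^{\infty} H_n'(t)\frac{d^{m-1}e^{-t^2/2}}{dt^{m-1}}\,dt \\
&= (-1)^{m-1}n\int_{-\infty}^{\infty} H_{n-1}(t)\frac{d^{m-1}e^{-t^2/2}}{dt^{m-1}}\,dt \qquad \text{Why?}
\end{aligned}$$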




Further integration by parts yields

$$I = (-1)^{m-n}n!\int_{-\infty}^{\infty} \frac{d^{m-n}e^{-t^2/2}}{dt^{m-n}}\,dt$$

If $m > n$, let the reader deduce that $I = 0$. If $m = n$,

$$I = n!\int_{-\infty}^{\infty} e^{-t^2/2}\,dt = \sqrt{2\pi}\,n!$$
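
The relation (5.197) can be confirmed exactly for small orders, since the Gaussian moments are known: $\int_{-\infty}^{\infty} t^ke^{-t^2/2}\,dt = \sqrt{2\pi}\,(k - 1)!!$ for even $k$ and zero for odd $k$. The following Python sketch works with integer coefficient lists (lowest power first) and divides every integral by $\sqrt{2\pi}$; the function names are illustrative, and the moment formula is a standard fact assumed here rather than derived in the text.

```python
from math import factorial

def hermite_coefficients(n_max):
    """Integer coefficient lists of H_0 .. H_{n_max} from H_{n+1} = t H_n - n H_{n-1}."""
    polys = [[1], [0, 1]]
    for n in range(1, n_max):
        shifted = [0] + polys[n]                                   # t * H_n
        prev = polys[n - 1] + [0] * (len(shifted) - len(polys[n - 1]))
        polys.append([s - n * p for s, p in zip(shifted, prev)])
    return polys[: n_max + 1]

def double_factorial_odd(k):
    """(k - 1)!! for even k >= 0: the Gaussian moment of t^k divided by sqrt(2 pi)."""
    result = 1
    for j in range(k - 1, 0, -2):
        result *= j
    return result

def scaled_inner_product(p, q):
    """Integral of exp(-t^2/2) p(t) q(t) over the real line, divided by sqrt(2 pi)."""
    return sum(pi * qj * double_factorial_odd(i + j)
               for i, pi in enumerate(p) for j, qj in enumerate(q) if (i + j) % 2 == 0)

polys = hermite_coefficients(6)
gram = [[scaled_inner_product(p, q) for q in polys] for p in polys]
```

The matrix `gram` is diagonal with entries $n!$, in agreement with (5.197) after the common factor $\sqrt{2\pi}$ is restored.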


REFERENCES 


Agnew, R. P.: "Differential Equations," McGraw-Hill Book Company, Inc., New
York, 1942.

Bateman, H.: "Partial Differential Equations of Mathematical Physics," Dover
Publications, New York, 1944.

Cohen, A.: "Differential Equations," D. C. Heath and Company, Boston, 1933.

Goursat, E.: "Differential Equations," Ginn & Company, Boston, 1917.

Ince, E. L.: "Ordinary Differential Equations," Dover Publications, New York,
1944.

Kells, L. M.: "Elementary Differential Equations," McGraw-Hill Book Company,
Inc., New York, 1947.

Miller, F. H.: "Partial Differential Equations," John Wiley & Sons, Inc., New York,
1941.

Petrovsky, I. G.: "Lectures on Partial Differential Equations," Interscience Publishers,
Inc., New York, 1954.


CHAPTER 6 


ORTHOGONAL POLYNOMIALS, FOURIER SERIES, 
AND FOURIER INTEGRALS 


6.1. Orthogonality of Functions. A function $f(x)$ defined over the 
interval $(a, b)$ may be thought of as an infinite-dimensional vector; 
$f(x_0)$ is the component of $f(x)$ at $x = x_0$. Let $g(x)$ be defined also over 
the interval $(a, b)$. In vector analysis one obtained the scalar, dot, or 
inner product of two vectors in a cartesian coordinate system by multiplying 
corresponding components and summing. Obviously we cannot 
sum $f(x)g(x)$ for all $x$, $a \leq x \leq b$. The next best thing is to define 

$$S(f, g) = \int_a^b f(x) g(x)\,dx \tag{6.1}$$

as the scalar product of $f$ and $g$. If $S(f, g) = 0$, we say that $f$ and $g$ are 
orthogonal on the range $(a, b)$. Generalizing, if 

$$\int_a^b \rho(x) f(x) g(x)\,dx = 0 \tag{6.2}$$

we say that $f$ and $g$ are orthogonal relative to the weight, or density, 
function, $\rho(x)$, on the interval $(a, b)$. 
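
The weighted scalar product of (6.2) can be approximated numerically. The following is a minimal sketch, assuming a simple midpoint rule; the function name `inner_product`, the default weight, and the step count are illustrative choices, not part of the text.

```python
import math

def inner_product(f, g, a, b, rho=lambda x: 1.0, steps=20000):
    """Approximate the integral of rho(x) f(x) g(x) over (a, b) by the midpoint rule."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += rho(x) * f(x) * g(x)
    return total * h

# sin and cos are orthogonal on (-pi, pi) with unit weight; sin has "length squared" pi.
s_sin_cos = inner_product(math.sin, math.cos, -math.pi, math.pi)
s_sin_sin = inner_product(math.sin, math.sin, -math.pi, math.pi)
```

Under these assumptions `s_sin_cos` comes out numerically negligible, while `s_sin_sin` is close to π.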

6.2. Generating Orthogonal Polynomials. In Chap. 5 we noticed that 
important classes of polynomials such as the Legendre, Laguerre, and 
Hermite polynomials arose in connection with the solutions of various 
differential equations. In this section we shall show how to generate 
orthogonal polynomials. The various theorems proved here will apply 
to every type of polynomial generated. We proceed as follows: 


Let $(a, b)$ be any closed interval, $-\infty \leq a \leq x \leq b \leq \infty$, and let 
$\rho(x)$ be any real-valued function satisfying the following conditions: 

(i) $\rho(x) \geq 0$ for $a \leq x \leq b$ 
(ii) $\int_\alpha^\beta \rho(x)\,dx > 0$ for $a \leq \alpha < \beta \leq b$                    (6.3) 
(iii) The moments of $\rho(x)$, say, $\mu_n$, defined by 
      $\mu_n = \int_a^b x^n \rho(x)\,dx$, $n = 0, 1, 2, \ldots$, 
      exist and are finite for $n = 0, 1, 2, \ldots$. 

We call $\rho(x)$ a weight, or density, function. 


We now proceed to construct a sequence, or family, of polynomials, 

$$P_0(x) = 1,\; P_1(x),\; P_2(x),\; \ldots,\; P_n(x),\; \ldots$$

such that 

$$P_n(x) = x^n + \sigma_n x^{n-1} + \cdots \tag{6.4}$$

and such that 

$$\int_a^b \rho(x) P_m(x) P_n(x)\,dx = 0 \qquad m \neq n \tag{6.5}$$

Condition (6.5) states that $P_m(x)$ and $P_n(x)$ are orthogonal to each other 
relative to the weight factor $\rho(x)$. Note that $P_n(x)$ has its leading 
coefficient equal to unity. We apply mathematical induction to generate 
the family $\{P_n(x)\}$. We have already defined $P_0(x) = 1$. Let 

$$P_1(x) = x + \sigma_1$$

To fulfill condition (6.5), we need 

$$\int \rho(x)(x + \sigma_1)\,dx = 0 \tag{6.6}$$

The limits of integration are omitted with the understanding that they 
remain fixed throughout the discussion. Equation (6.6) yields 

$$\sigma_1 = -\frac{\int x \rho(x)\,dx}{\int \rho(x)\,dx}$$

[see (ii) and (iii) of (6.3)]. This choice of $\sigma_1$ yields $P_1(x)$ orthogonal to 
$P_0(x)$. Now assume that we have constructed the sequence of orthogonal 
polynomials $P_0(x), P_1(x), P_2(x), \ldots, P_k(x)$ satisfying (6.5). We know 
that $k$ is at least 1. We now show that we can extend our constructed 
set of polynomials to include the polynomial $P_{k+1}(x)$ while preserving 
the important orthogonality condition (6.5). Let 





$$P_{k+1}(x) = x^{k+1} + \sum_{r=0}^{k} c_r P_r(x) \tag{6.7}$$

where $c_r$, $r = 0, 1, 2, \ldots, k$, are $k + 1$ undetermined constants. We 
desire 

$$\int \rho(x) P_s(x) P_{k+1}(x)\,dx = 0 \qquad s = 0, 1, 2, \ldots, k \tag{6.8}$$

With the aid of (6.7), (6.8) becomes 

$$\int x^{k+1} P_s(x) \rho(x)\,dx + \sum_{r=0}^{k} c_r \int \rho(x) P_s(x) P_r(x)\,dx = 0 \tag{6.9}$$

However, for $s \neq r$ we have $\int \rho(x) P_s(x) P_r(x)\,dx = 0$ provided $s \leq k$, 
$r \leq k$, by our assumption on the set $P_0, P_1, P_2, \ldots, P_k(x)$. Hence of 
necessity 

$$c_s = -\frac{\int x^{k+1} P_s(x) \rho(x)\,dx}{\int P_s^2(x) \rho(x)\,dx} \qquad s = 0, 1, 2, \ldots, k \tag{6.10}$$

The numerator of $c_s$ exists [see (iii) of (6.3)], and the denominator of $c_s$ 
is different from zero and is a linear combination of various moments of 
$\rho(x)$, each of which exists. We leave this as an exercise for the reader. 
It is a trivial procedure to reverse the steps taken to derive the $c_s$, $s = 0$, 
$1, 2, \ldots, k$; that is, if the $c_s$ of (6.10) are substituted into (6.7), one 
easily shows that $P_{k+1}(x)$ is orthogonal to $P_r(x)$, $r = 0, 1, 2, \ldots, k$, 
relative to $\rho(x)$. The principle of finite mathematical induction states 
that a sequence of polynomials $P_0(x), P_1(x), P_2(x), \ldots, P_n(x), \ldots$ 
exists satisfying (6.5). 


Problem 1. Prove by mathematical induction that the sequence of polynomials 
$\{P_n(x)\}$ is unique relative to the interval $(a, b)$ and the density function $\rho(x)$. 


For the interval $(0, 1)$ and for $\rho(x) = 1$ the orthogonal polynomials are 
the well-known Legendre polynomials. We generate now $P_1(x)$, $P_2(x)$. 
For $P_1(x) = x + \sigma_1$ we have 

$$\sigma_1 = -\frac{\int_0^1 x\,dx}{\int_0^1 dx} = -\frac{1}{2}$$

so that $P_1(x) = x - \tfrac{1}{2}$. Let $P_2(x) = x^2 + a_1 P_1 + a_0 P_0$. From 

$$\int_0^1 P_0 P_2\,dx = 0$$

we have $\int_0^1 x^2\,dx + a_0 \int_0^1 dx = 0$, so that $a_0 = -\tfrac{1}{3}$. From 

$$\int_0^1 P_1 P_2\,dx = 0$$

we have $\int_0^1 (x - \tfrac{1}{2}) x^2\,dx + a_1 \int_0^1 (x - \tfrac{1}{2})^2\,dx = 0$, so that $a_1 = -1$. 
Thus $P_2(x) = x^2 - (x - \tfrac{1}{2}) - \tfrac{1}{3} = x^2 - x + \tfrac{1}{6}$. 
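
The construction of (6.7) and (6.10) can be carried out mechanically. The sketch below assumes the interval (0, 1) with unit weight and represents a polynomial by its list of ascending coefficients; the helper names (`poly_mul`, `integrate_01`, `monic_orthogonal`) are hypothetical and exact rational arithmetic is used so that the results can be compared with the hand computation above.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given by ascending coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate_01(p):
    """Exact integral over (0, 1) of a polynomial with unit weight."""
    return sum(c / (k + 1) for k, c in enumerate(p))

def monic_orthogonal(n_max):
    """Build P_0, ..., P_{n_max} from (6.7) with the constants c_s of (6.10)."""
    family = [[Fraction(1)]]                               # P_0 = 1
    for k in range(n_max):
        x_pow = [Fraction(0)] * (k + 1) + [Fraction(1)]    # x^(k+1)
        new = list(x_pow)
        for P_s in family:
            c_s = -integrate_01(poly_mul(x_pow, P_s)) / integrate_01(poly_mul(P_s, P_s))
            for i, coeff in enumerate(P_s):
                new[i] += c_s * coeff                      # add c_s * P_s
        family.append(new)
    return family

family = monic_orthogonal(3)
P1, P2 = family[1], family[2]
```

Here `P1` holds the coefficients of x − 1/2 and `P2` those of x² − x + 1/6, in agreement with the worked example; other weights or intervals would need a different `integrate_01`.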
Problem 2. For the interval $(0, \infty)$ and for $\rho(x) = e^{-x}$ generate $L_1(x)$ and $L_2(x)$. 
These are the Laguerre polynomials. 
Problem 3. Construct the Hermite polynomials $H_1(x)$, $H_2(x)$ for the interval 
$(-\infty, \infty)$, $\rho(x) = e^{-x^2/2}$. 


An interesting result concerning orthogonal polynomials is the following: 
Let $Q(x)$ be any polynomial of degree $n$. There exists a unique set 
of constants $c_0, c_1, \ldots, c_n$ such that 

$$Q(x) = \sum_{r=0}^{n} c_r P_r(x) \qquad c_n \neq 0$$

The constants depend, of course, on the family of polynomials under 
discussion. The proof is by induction. If $Q(x)$ is of degree zero, say 
$Q(x) = a$, then $Q(x) = aP_0(x)$. Assume the theorem true for all polynomials 
of degree $\leq k$. Now let $Q(x)$ be any polynomial of degree 
$k + 1$, $Q(x) = a_0 x^{k+1} + a_1 x^k + \cdots + a_{k+1}$, $a_0 \neq 0$. Then 
$Q(x) - a_0 P_{k+1}(x)$ is a polynomial of degree $\leq k$. By our assumption 

$$Q(x) - a_0 P_{k+1}(x) = \sum_{r=0}^{k} c_r P_r(x)$$

so that 

$$Q(x) = \sum_{r=0}^{k+1} c_r P_r(x) \qquad c_{k+1} = a_0$$

Problem 4. Show that $c_r$, $r = 0, 1, 2, 3, \ldots, k + 1$, in the preceding line are 
unique. 
Problem 5. Show that $\int \rho(x) x^n P_m(x)\,dx = 0$ for $m > n$. 
Problem 6. If $Q(x)$ is a polynomial of degree $m < n$, show that 

$$\int \rho(x) Q(x) P_n(x)\,dx = 0$$

Problem 7. Show that the conditions 

$$\int \rho(x) x^k P_n(x)\,dx = 0 \qquad k = 0, 1, 2, \ldots, n - 1 \tag{6.11}$$

yield a unique polynomial $P_n(x) = x^n + \sigma_n x^{n-1} + \cdots$ and that 

$$\int \rho(x) P_m(x) P_n(x)\,dx = 0$$

for $m \neq n$ if $P_m(x)$ is generated in the same manner. Equation (6.11) could have 
been taken as the starting point for developing orthogonal polynomials. 

Problem 8. For $a \neq -\infty$, $b \neq +\infty$ find a linear transformation $x = r\bar{x} + s$ such 
that $\bar{x} = 0$ when $x = a$, $\bar{x} = 1$ when $x = b$. For $a \neq \pm\infty$, $b = +\infty$ find a linear 
transformation which maps $x = a$ into $\bar{x} = 0$, $x = +\infty$ into $\bar{x} = +\infty$. 

Problem 9. Show that $\int x P_{n-1} P_n \rho\,dx = \int P_n^2 \rho\,dx$. 


6.3. Normalizing Factors. Examples of Orthogonal Polynomials. 
The constants 

$$\gamma_n^2 = \int P_n^2(x) \rho(x)\,dx \qquad n = 0, 1, 2, \ldots \tag{6.12}$$

are called the normalizing factors of the $P_n(x)$. It follows that 

$$\int \frac{\rho(x) P_n^2(x)}{\gamma_n^2}\,dx = \int \left[ \frac{\sqrt{\rho(x)}\, P_n(x)}{\gamma_n} \right]^2 dx = 1$$

We define the orthonormal set of functions $\{\varphi_n(x)\}$ by 

$$\varphi_n(x) = \frac{1}{\gamma_n} \sqrt{\rho(x)}\, P_n(x) \qquad n = 0, 1, 2, \ldots \tag{6.13}$$

The $\varphi_n(x)$ are not, in general, polynomials. They possess the attractive 
property 

$$\int \varphi_i(x) \varphi_j(x)\,dx = \delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}$$

The $\{\varphi_n(x)\}$ form what is known as a normalized orthogonal system of 
functions associated with the family of orthogonal polynomials $\{P_n(x)\}$. 
We now list a few families of orthogonal polynomials. 





     Interval              Weight function                     Polynomial 

1    0 ≤ x ≤ 1             1                                   Legendre 
2    -1 ≤ x ≤ 1            (1 - x²)^(-1/2) or (1 - x²)^(1/2)   Tchebysheff of 1st and 2d kinds 
3    0 ≤ x ≤ ∞             e^(-x)                              Laguerre 
4    0 ≤ x ≤ ∞             x^c e^(-x), c > 0                   Generalized Laguerre 
5    -∞ ≤ x ≤ ∞            e^(-x²/2)                           Hermite 





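Example 6.1. Let $a = 0$, $b = 1$ with $\rho(x) = 1$. We choose 

$$P_n(x) = 1 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$

From $\int_0^1 x^k P_n(x)\,dx = 0$ for $k < n$ we have 

$$\frac{1}{k+1} + \frac{a_1}{k+2} + \cdots + \frac{a_n}{k+n+1} = 0 \qquad \text{for } k = 0, 1, 2, \ldots, n - 1$$

We can solve these $n$ linear equations for the $a_i$, $i = 1, 2, \ldots, n$, as follows: We 
have 

$$\frac{1}{k+1} + \frac{a_1}{k+2} + \cdots + \frac{a_n}{k+n+1} = \frac{Q(k)}{(k+1)(k+2) \cdots (k+n+1)}$$

where $Q(k)$ is a polynomial in $k$ of degree at most $n$. Since $Q(k)$ vanishes for $k = 0, 1, 
\ldots, n - 1$, of necessity 

$$Q(k) = Ck(k-1)(k-2) \cdots (k-n+1)$$

(see Lemma 1, Sec. 6.4). Thus 

$$1 + a_1 \frac{k+1}{k+2} + a_2 \frac{k+1}{k+3} + \cdots + a_n \frac{k+1}{k+n+1} = C\,\frac{k(k-1) \cdots (k-n+1)}{(k+2)(k+3) \cdots (k+n+1)}$$

If we set $k = -1$, we obtain 

$$1 = \frac{C(-1)(-2) \cdots (-n)}{1 \cdot 2 \cdot 3 \cdots n}$$

so that $C = (-1)^n$. Thus 

$$\frac{1}{k+1} + \frac{a_1}{k+2} + \cdots + \frac{a_n}{k+n+1} = \frac{(-1)^n k(k-1) \cdots (k-n+1)}{(k+1)(k+2) \cdots (k+n+1)} \tag{a}$$

To solve for $a_r$, we multiply the above by $k + r + 1$ and then set $k = -(r + 1)$. 
This yields 

$$a_r = \frac{(-1)^n (r+1)(r+2) \cdots (r+n)(-1)^n}{(-r)(-r+1) \cdots (-1)(1)(2) \cdots (n-r)} = (-1)^r \frac{(n+r)!}{r!\,r!\,(n-r)!}$$

Hence 

$$P_n(x) = \sum_{r=0}^{n} a_r x^r = \sum_{r=0}^{n} (-1)^r \binom{n}{r} \binom{n+r}{r} x^r$$

The Legendre polynomial with leading coefficient unity is 

$$P_n(x) = \sum_{r=0}^{n} (-1)^{n+r}\, \frac{\binom{n}{r} \binom{n+r}{r}}{\binom{2n}{n}}\, x^r$$

We now determine $\int_0^1 P_n^2(x) \rho(x)\,dx$ for the polynomial $P_n(x) = 1 + a_1 x + \cdots + a_n x^n$. We have 

$$\int_0^1 P_n^2(x)\,dx = \int_0^1 a_n x^n P_n(x)\,dx = a_n \sum_{r=0}^{n} a_r \int_0^1 x^{n+r}\,dx = a_n \sum_{r=0}^{n} \frac{a_r}{n+r+1} = a_n \frac{(-1)^n\, n!\, n!}{(2n+1)!}$$

from (a) above. Thus 

$$\int_0^1 P_n^2(x)\,dx = (-1)^n \binom{2n}{n} \frac{(-1)^n\, n!\, n!}{(2n+1)!} = \frac{(2n)!}{(2n+1)!} = \frac{1}{2n+1}$$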
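
The closed form of Example 6.1 can be checked with exact arithmetic. This sketch assumes the coefficients a_r = (−1)^r C(n, r) C(n + r, r), the interval (0, 1), and unit weight; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def legendre_01(n):
    """Ascending coefficients a_r = (-1)^r C(n, r) C(n + r, r), with a_0 = 1."""
    return [Fraction((-1) ** r * comb(n, r) * comb(n + r, r)) for r in range(n + 1)]

def integral_01(p, q):
    """Exact integral over (0, 1) of the product of two coefficient lists."""
    return sum(a * b / (i + j + 1) for i, a in enumerate(p) for j, b in enumerate(q))

# Normalizing integrals for n = 0, ..., 5 and one cross term.
norms = [integral_01(legendre_01(n), legendre_01(n)) for n in range(6)]
cross = integral_01(legendre_01(2), legendre_01(4))
```

Under these assumptions each entry of `norms` equals 1/(2n + 1) and `cross` vanishes.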


6.4. The Zeros of the Orthogonal Polynomials. We wish to show that 
the zeros of $P_n(x)$ of Sec. 6.2 are real, distinct, and lie in the fundamental 
interval. We first state and prove two lemmas. 

Lemma 1. If $x = r$ is a root of $P(x) = 0$, then $x - r$ is a factor of 
$P(x)$, $P(x)$ a polynomial. The proof is as follows: We have upon division 

$$\frac{P(x)}{x - r} = Q(x) + \frac{R}{x - r} \qquad R = \text{constant}$$

or 

$$P(x) = Q(x)(x - r) + R$$

If $P(r) = 0$, then $R = 0$ so that $P(x) = Q(x)(x - r)$. Q.E.D. 

Lemma 2. If $z = a + ib$ is a zero of a real polynomial $P(x)$, $a$, $b$ 
real, then $a - ib$ is also a zero of $P(x)$. It is easy to see that 

$$P(a + ib) = U(a, b) + iV(a, b)$$
$$P(a - ib) = U(a, b) - iV(a, b)$$

Since $P(a + ib) = 0$, we have $U = V = 0$ so that $P(a - ib) = 0$. 

We now prove the general theorem stated above. Let $P_n(x)$ be a real 
orthogonal polynomial. If $z = a + ib$, $b \neq 0$, is a zero of $P_n(x)$, then 
$[x - (a + ib)][x - (a - ib)] = (x - a)^2 + b^2$ is a factor of $P_n(x)$. This 
factor is positive for all $x$ in our fundamental interval. This is true 
for all complex zeros of $P_n(x)$. Now let $x = x_1$, $x = x_2$, $\ldots$, $x = x_N$ 
be the real zeros of $P_n(x)$ of odd multiplicity lying on the fundamental interval. 
Certainly $N \leq n$. Assume $N < n$, and let 

$$Q(x) = (x - x_1)(x - x_2) \cdots (x - x_N) P_n(x)$$

Let the reader deduce that $Q(x) > 0$ for all $x$ on the interval except for $x = x_1, 
x_2, \ldots, x_N$ or $Q(x) < 0$ for all $x$ on the interval except for $x = x_1, x_2, \ldots, x_N$. 
We have, however, 

$$\int \rho(x) Q(x)\,dx = \int \rho(x)(x - x_1) \cdots (x - x_N) P_n(x)\,dx = 0 \tag{6.14}$$

since $(x - x_1)(x - x_2) \cdots (x - x_N)$ is a polynomial of degree $N < n$ 
(see Prob. 6, Sec. 6.2). But $\int \rho(x) Q(x)\,dx > 0$ or $\int \rho(x) Q(x)\,dx < 0$ 
from the nature of $Q(x)$, a contradiction to (6.14). Hence $N = n$, and 
the theorem is proved. 

6.5. The Difference Equation for Orthogonal Polynomials. We obtain 
now an equation involving $P_{n-1}(x)$, $P_n(x)$, and $P_{n+1}(x)$. We have 

$$P_{n+1}(x) = x^{n+1} + \sigma_{n+1} x^n + \cdots$$
$$P_n(x) = x^n + \sigma_n x^{n-1} + \cdots$$

so that 

$$P_{n+1}(x) - xP_n(x) = (\sigma_{n+1} - \sigma_n) x^n + \cdots$$

and 

$$P_{n+1}(x) - xP_n(x) = (\sigma_{n+1} - \sigma_n) P_n(x) + \cdots$$

since $x^n = P_n(x) - \sigma_n x^{n-1} - \cdots$. Hence 

$$P_{n+1}(x) - xP_n(x) - (\sigma_{n+1} - \sigma_n) P_n(x) = P_{n+1}(x) - (x + \sigma_{n+1} - \sigma_n) P_n(x)$$

is a polynomial of degree at most $n - 1$. From a previous theorem 

$$P_{n+1}(x) - (x + \sigma_{n+1} - \sigma_n) P_n(x) = \sum_{r=0}^{n-1} c_r P_r(x) = c_{n-1} P_{n-1}(x) + \sum_{r=0}^{n-2} c_r P_r(x) \tag{6.15}$$

We multiply (6.15) by $\rho(x) P_{n-1}(x)$ and integrate over the interval $(a, b)$. 
Using the orthogonality property and the fact that 

$$\int x P_{n-1}(x) P_n(x) \rho(x)\,dx = \int P_n^2(x) \rho(x)\,dx = \gamma_n^2$$

(see Prob. 9, Sec. 6.2), we obtain $-\gamma_n^2 = c_{n-1} \gamma_{n-1}^2$. Thus 

$$P_{n+1}(x) - (x + \sigma_{n+1} - \sigma_n) P_n(x) + \frac{\gamma_n^2}{\gamma_{n-1}^2} P_{n-1}(x) = \sum_{r=0}^{n-2} c_r P_r(x) \tag{6.16}$$

is a polynomial of degree at most $n - 2$. To show that $c_r = 0$ for 
$r = 0, 1, 2, \ldots, n - 2$, one need only multiply (6.16) by $\rho(x) P_s(x)$, 
$s \leq n - 2$, and perform an integration over $(a, b)$. Hence 

$$P_{n+1}(x) - (x + \sigma_{n+1} - \sigma_n) P_n(x) + \frac{\gamma_n^2}{\gamma_{n-1}^2} P_{n-1}(x) = 0 \qquad n \geq 1 \tag{6.17}$$

Equation (6.17) is the difference equation satisfied by the orthogonal 
polynomials. Using (6.17), it is possible to show that the zeros of $P_n(x)$, 
$P_{n+1}(x)$ interlace in the sense that, if $x_1 < x_2 < \cdots < x_n$ are the zeros 
of $P_n(x)$ and if $y_1 < y_2 < \cdots < y_n < y_{n+1}$ are the zeros of $P_{n+1}(x)$, 
then $a < y_1 < x_1 < y_2 < x_2 < \cdots < y_n < x_n < y_{n+1} < b$. We omit 
this proof. 
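
The difference equation (6.17) can be tested on a concrete family. The sketch below assumes the interval (0, 1) with unit weight and takes the monic polynomials from the explicit coefficients (−1)^r C(n, r) C(n + r, r) of Example 6.1 divided by the leading coefficient; all function names are illustrative. It forms the left side of (6.17) coefficient by coefficient.

```python
from fractions import Fraction
from math import comb

def monic_legendre_01(n):
    """Monic orthogonal polynomial on (0, 1), ascending coefficients."""
    a = [Fraction((-1) ** r * comb(n, r) * comb(n + r, r)) for r in range(n + 1)]
    return [c / a[n] for c in a]

def norm_sq(p):
    """gamma^2 of (6.12): exact integral of p(x)^2 over (0, 1)."""
    return sum(a * b / (i + j + 1) for i, a in enumerate(p) for j, b in enumerate(p))

def recurrence_residual(n):
    """Coefficients of the left side of (6.17) for the given n >= 1."""
    P_prev, P_n, P_next = (monic_legendre_01(m) for m in (n - 1, n, n + 1))
    sigma_n, sigma_next = P_n[n - 1], P_next[n]     # second-highest coefficients
    ratio = norm_sq(P_n) / norm_sq(P_prev)          # gamma_n^2 / gamma_{n-1}^2
    res = list(P_next)                              # P_{n+1}
    for i, c in enumerate(P_n):
        res[i + 1] -= c                             # - x P_n
        res[i] -= (sigma_next - sigma_n) * c        # - (sigma_{n+1} - sigma_n) P_n
    for i, c in enumerate(P_prev):
        res[i] += ratio * c                         # + ratio * P_{n-1}
    return res

residuals = [recurrence_residual(n) for n in range(1, 6)]
```

Under these assumptions every coefficient in `residuals` is exactly zero, which is what (6.17) asserts for n = 1, ..., 5.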

6.6. The Christoffel-Darboux Identity. From (6.17) we have 

$$P_{n+1}(t) - (t + \sigma_{n+1} - \sigma_n) P_n(t) + \frac{\gamma_n^2}{\gamma_{n-1}^2} P_{n-1}(t) = 0 \tag{6.18}$$

so that 

$$P_n(t) P_{n+1}(x) - (x + \sigma_{n+1} - \sigma_n) P_n(x) P_n(t) + \frac{\gamma_n^2}{\gamma_{n-1}^2} P_n(t) P_{n-1}(x) = 0$$
$$P_n(x) P_{n+1}(t) - (t + \sigma_{n+1} - \sigma_n) P_n(x) P_n(t) + \frac{\gamma_n^2}{\gamma_{n-1}^2} P_n(x) P_{n-1}(t) = 0 \tag{6.19}$$

upon multiplying (6.17) by $P_n(t)$ and (6.18) by $P_n(x)$. Subtracting and 
dividing by $\gamma_n^2$ yields 

$$\frac{P_{n+1}(x) P_n(t) - P_{n+1}(t) P_n(x)}{\gamma_n^2} - \frac{P_n(x) P_{n-1}(t) - P_n(t) P_{n-1}(x)}{\gamma_{n-1}^2} = \frac{(x - t) P_n(x) P_n(t)}{\gamma_n^2} \qquad n \geq 1 \tag{6.20}$$

We have also 

$$\frac{P_1(x) P_0(t) - P_1(t) P_0(x)}{\gamma_0^2} = \frac{(x + \sigma_1) - (t + \sigma_1)}{\gamma_0^2} = \frac{(x - t) P_0(x) P_0(t)}{\gamma_0^2} \tag{6.21}$$

If we let $n = 1, 2, 3, \ldots, k$ in (6.20), add, and make use of (6.21), we 
obtain 

$$\frac{1}{\gamma_k^2}\, \frac{P_{k+1}(x) P_k(t) - P_{k+1}(t) P_k(x)}{x - t} = \sum_{n=0}^{k} \frac{P_n(x) P_n(t)}{\gamma_n^2} \tag{6.22}$$

Equation (6.22) is the Christoffel-Darboux identity. We define the 
kernel, $K_k(x; t)$, by the equation 

$$K_k(x; t) = \sum_{n=0}^{k} \frac{P_n(x) P_n(t)}{\gamma_n^2} \tag{6.23}$$

Problem 10. Evaluate $K_k(x; x)$ by L'Hospital's rule. 

Problem 11. If $x = c$, $x = d$ are distinct zeros of $P_k(x)$, show that $K_k(c; d) = 0$. 
What if $c = d$? 

Problem 12. Prove that 

$$\int_a^b K_k(x; t) \rho(t)\,dt = 1 \tag{6.24}$$


6.7. Fourier Coefficients and Partial Sums. We may redefine the 
orthogonal polynomials so that 

$$\int_a^b \rho(x) p_m(x) p_n(x)\,dx = \delta_{mn} = \begin{cases} 0 & \text{if } m \neq n \\ 1 & \text{if } m = n \end{cases} \tag{6.25}$$

by simply letting $p_n(x) = (1/\gamma_n) P_n(x)$, $n = 0, 1, 2, \ldots$ (see Sec. 6.3). 

Now let us consider a real-valued function of $x$ whose Taylor-series expansion 
about $x = 0$ exists. We have 

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\, x^n \tag{6.26}$$

We see that $f(x)$ can be expanded as a linear combination of terms from 
the sequence of polynomials 

$$1,\; x,\; x^2,\; \ldots,\; x^n,\; \ldots$$

$$f(x) = \sum_{n=0}^{\infty} c_n x^n \qquad c_n = \frac{f^{(n)}(0)}{n!} \tag{6.27}$$

It seems logical to ask whether a given function $f(x)$ can be expanded in 
terms of the infinite sequence of polynomials 

$$p_0(x),\; p_1(x),\; p_2(x),\; \ldots,\; p_n(x),\; \ldots \tag{6.28}$$


If this is the case, we may write 

$$f(x) = \sum_{n=0}^{\infty} c_n p_n(x) \tag{6.29}$$

To determine the coefficients $c_n$, $n = 0, 1, 2, \ldots$, we multiply (6.29) by 
$\rho(x) p_k(x)$ and integrate. This yields 

$$\int_a^b \rho(x) p_k(x) f(x)\,dx = \int_a^b \sum_{n=0}^{\infty} c_n \rho(x) p_k(x) p_n(x)\,dx \tag{6.30}$$

Assuming further that the series expansion converges uniformly on $(a, b)$ 
yields 

$$\int_a^b \rho(x) p_k(x) f(x)\,dx = \sum_{n=0}^{\infty} c_n \int_a^b \rho(x) p_k(x) p_n(x)\,dx = c_k \tag{6.31}$$

The $c_k$, $k = 0, 1, 2, \ldots$, of (6.31) are defined as the "Fourier" coefficients 
of $f(x)$. If $\int_a^b \rho(x) f^2(x)\,dx$ exists, then from the Schwarz-Cauchy 
inequality for integrals we have 

$$|c_k|^2 \leq \int_a^b \rho(x) f^2(x)\,dx \int_a^b \rho(x) p_k^2(x)\,dx \tag{6.32}$$

Thus $c_k$ given by 

$$c_k = \int_a^b \rho(x) p_k(x) f(x)\,dx \qquad k = 0, 1, 2, \ldots \tag{6.33}$$

exists provided $\int_a^b \rho(x) f^2(x)\,dx$ exists. Thus one can speak of the Fourier 
coefficients of $f(x)$ regardless of the existence or nonexistence of the expansion 
of $f(x)$ in the form given by (6.29). 


If the $c_k$, $k = 0, 1, 2, \ldots, n$, of (6.33) exist, we can form the series 

$$s_n(x) = \sum_{i=0}^{n} c_i p_i(x) \tag{6.34}$$

$s_n(x)$ is called the $n$th partial sum of the Fourier series associated with 
$f(x)$. With the aid of (6.23) and (6.33) we have 

$$s_n(x) = \int_a^b \sum_{i=0}^{n} \rho(t) f(t) p_i(t) p_i(x)\,dt = \int_a^b K_n(x; t) f(t) \rho(t)\,dt \tag{6.35}$$

The difference between $f(x)$ and $s_n(x)$ is called the $n$th remainder in the 
Fourier series of $f(x)$. We have 

$$R_n(x) = f(x) - s_n(x) = \int_a^b f(x) K_n(x; t) \rho(t)\,dt - \int_a^b K_n(x; t) f(t) \rho(t)\,dt$$
$$= \int_a^b K_n(x; t) \rho(t) [f(x) - f(t)]\,dt \tag{6.36}$$

since $\int_a^b K_n(x; t) \rho(t)\,dt = 1$ [see (6.24)]. Now 

$$K_n(x; t) = \frac{\gamma_{n+1}}{\gamma_n}\, \frac{p_{n+1}(x) p_n(t) - p_{n+1}(t) p_n(x)}{x - t}$$

so that 

$$R_n(x) = \frac{\gamma_{n+1}}{\gamma_n} \int_a^b [p_{n+1}(x) p_n(t) - p_{n+1}(t) p_n(x)]\, \frac{f(x) - f(t)}{x - t}\, \rho(t)\,dt \tag{6.37}$$

The reader should explain the apparent difficulty at $t = x$. If 

$$\lim_{n \to \infty} R_n(x) = 0$$

the Fourier series, $\sum_{n=0}^{\infty} c_n p_n(x)$, converges to $f(x)$. 
6.8. Bessel's Inequality. We show first that $R_n(x) = f(x) - s_n(x)$ is 
orthogonal to $p_k(x)$, $k = 0, 1, 2, \ldots, n$. We have 

$$\int_a^b [f(x) - s_n(x)] p_k(x) \rho(x)\,dx = \int_a^b f(x) p_k(x) \rho(x)\,dx - \sum_{i=0}^{n} c_i \int_a^b \rho(x) p_i(x) p_k(x)\,dx$$
$$= c_k - \sum_{i=0}^{n} c_i \delta_{ik} = c_k - c_k = 0 \tag{6.38}$$

We show next that 

$$\int_a^b f^2(x) \rho(x)\,dx \geq \sum_{k=0}^{n} c_k^2 \tag{6.39}$$

The proof is as follows: Clearly 

$$\int_a^b [f(x) - s_n(x)]^2 \rho(x)\,dx \geq 0$$

so that 

$$\int_a^b f^2(x) \rho(x)\,dx - 2 \int_a^b f(x) s_n(x) \rho(x)\,dx + \int_a^b s_n^2(x) \rho(x)\,dx \geq 0 \tag{6.40}$$

However, 

$$\int_a^b f(x) s_n(x) \rho(x)\,dx = \sum_{k=0}^{n} c_k \int_a^b f(x) p_k(x) \rho(x)\,dx = \sum_{k=0}^{n} c_k^2$$

and 

$$\int_a^b s_n^2(x) \rho(x)\,dx = \sum_{j=0}^{n} \sum_{k=0}^{n} c_j c_k \int_a^b p_j(x) p_k(x) \rho(x)\,dx = \sum_{j=0}^{n} \sum_{k=0}^{n} c_j c_k \delta_{jk} = \sum_{k=0}^{n} c_k^2$$

so that (6.39) follows as a consequence. Since (6.39) holds for all $n$, the 
sequence $t_n = \sum_{k=0}^{n} c_k^2$, $n = 0, 1, 2, \ldots$, is bounded. Moreover, this 
sequence is monotonic nondecreasing. This implies that $\lim_{n \to \infty} t_n$ exists. 
Let the reader deduce that 

$$\int_a^b f^2(x) \rho(x)\,dx \geq \sum_{k=0}^{\infty} c_k^2 \tag{6.41}$$

Formula (6.41) is Bessel's inequality. 


Problem 13. Why is it true that $\lim_{k \to \infty} c_k = 0$? 
Problem 14. If $f(x)$ is a polynomial of degree $n$, show that 

$$\int_a^b f^2(x) \rho(x)\,dx = \sum_{k=0}^{n} c_k^2$$

Problem 15. For $(\alpha, \beta)$ such that $a \leq \alpha \leq \beta \leq b$ assume that 

$$d_n = \int_\alpha^\beta g(x) p_n(x) \rho(x)\,dx$$

exists. Let $f(x) = g(x)$ for $\alpha \leq x \leq \beta$, $f(x) = 0$ otherwise. Show that $\lim_{n \to \infty} d_n = 0$. 


6.9. A Minimal Property of the Partial Fourier Sums. We look for a 
polynomial $Q(x)$ of degree $\leq n$ such that 

$$I = \int_a^b [f(x) - Q(x)]^2 \rho(x)\,dx$$

shall be a minimum. Let $s_n(x)$ be the $n$th partial Fourier sum of $f(x)$. 
Then 

$$I = \int_a^b [f(x) - s_n(x) + s_n(x) - Q(x)]^2 \rho(x)\,dx$$
$$= \int_a^b [f(x) - s_n(x)]^2 \rho(x)\,dx + 2 \int_a^b [f(x) - s_n(x)][s_n(x) - Q(x)] \rho(x)\,dx + \int_a^b [s_n(x) - Q(x)]^2 \rho(x)\,dx$$

The second integral vanishes from the results of Sec. 6.8 [see (6.38)] since 
$s_n(x) - Q(x)$ is a polynomial of degree $\leq n$. The first integral has a 
fixed value. Hence $I$ will be a minimum when $\int_a^b [s_n(x) - Q(x)]^2 \rho(x)\,dx$ 
is a minimum. This obviously occurs if we choose $Q(x) = s_n(x)$. This 
is a highly important property of $s_n(x)$. 

Problem 16. Let $Q_n(x)$ be a polynomial of degree $n$ with leading coefficient unity. 
If $\int_a^b Q_n^2(x) \rho(x)\,dx$ is to be a minimum, show that $Q_n(x) = P_n(x)$. 

Problem 17. Let $\{P_n(x)\}$ be a family of polynomials with leading coefficients equal 
to unity, $P_n(x)$ of degree $n$ such that $\int_a^b P_n^2(x) \rho(x)\,dx$ is a minimum. Without using 
the result of Prob. 16 show that 

$$\int_a^b P_m(x) P_n(x) \rho(x)\,dx = 0 \qquad m \neq n$$

Problem 18. Show that $\gamma_{n+1}^2 < \gamma_n^2$ for the fundamental interval $(-1, 1)$. Hint: 

$$\int_{-1}^{1} P_{n+1}^2(x) \rho(x)\,dx \leq \int_{-1}^{1} x^2 P_n^2(x) \rho(x)\,dx < \int_{-1}^{1} P_n^2(x) \rho(x)\,dx$$


6.10. Complete and Closed Sets of Orthogonal Polynomials. A set of 
orthogonal polynomials is said to be complete if for every continuous 
function $f(x)$ we have 

$$\lim_{n \to \infty} \int_a^b R_n^2(x) \rho(x)\,dx = 0 \tag{6.42}$$

where $R_n(x)$ is defined by (6.36). From Sec. 6.8 we have 

$$\int_a^b R_n^2(x) \rho(x)\,dx = \int_a^b f^2(x) \rho(x)\,dx - \sum_{k=0}^{n} c_k^2$$

so that (6.42) is equivalent to the statement 

$$\int_a^b f^2(x) \rho(x)\,dx = \sum_{k=0}^{\infty} c_k^2 \tag{6.43}$$

We shall prove in Sec. 6.11 that the orthogonal polynomials generated 
above are complete for any finite range $(a, b)$. 

DEFINITION 6.1. A set of orthogonal polynomials is said to be closed if 
the vanishing of all the Fourier coefficients of a continuous function $f(x)$ 
implies $f(x) \equiv 0$. In other words, if $f(x)$ is continuous and if 

$$\int_a^b f(x) p_n(x) \rho(x)\,dx = 0 \qquad n = 0, 1, 2, \ldots \tag{6.44}$$

then $f(x) \equiv 0$. 


Problem 19. Show that a complete set of orthogonal polynomials is a closed set. 
Problem 20. Show that (6.44) implies $A_n = \int_a^b x^n f(x) \rho(x)\,dx = 0$ for $n = 0, 1, 2, 
\ldots$, and conversely. 


6.11. Completeness of the Orthogonal Polynomials for a Finite Range. 
By a linear transformation the fundamental interval $(a, b)$ can be reduced 
to the interval $-1 \leq x \leq 1$ if $a$ and $b$ are finite. The famous Weierstrass 
approximation theorem states that a continuous function on a closed 
interval can be approximated arbitrarily closely by a polynomial (see 
Courant-Hilbert, "Mathematische Physik," vol. 1). In our case we wish 
to approximate a continuous function $f(x)$ on $-1 \leq x \leq 1$ to within 
$\sqrt{\epsilon/\mu_0}$, where $\epsilon$ is an assigned positive number and $\mu_0$ is the first moment, 
$\mu_0 = \int_{-1}^{1} \rho(x)\,dx$. The Weierstrass approximation theorem states that 
there is a polynomial $Q(x)$ such that 

$$|f(x) - Q(x)| < \sqrt{\frac{\epsilon}{\mu_0}} \qquad -1 \leq x \leq 1 \tag{6.45}$$

We denote the degree of $Q$ by $m$ so that, from Sec. 6.9 for $n \geq m$, 

$$\int_{-1}^{1} [f(x) - s_n(x)]^2 \rho(x)\,dx \leq \int_{-1}^{1} [f(x) - Q(x)]^2 \rho(x)\,dx$$

and 

$$\int_{-1}^{1} [f(x) - s_n(x)]^2 \rho(x)\,dx < \frac{\epsilon}{\mu_0} \int_{-1}^{1} \rho(x)\,dx = \epsilon$$

Expanding $\int_{-1}^{1} [f(x) - s_n(x)]^2 \rho(x)\,dx$ as in Sec. 6.8, we have 

$$\int_{-1}^{1} f^2(x) \rho(x)\,dx - \sum_{k=0}^{n} c_k^2 < \epsilon$$

But $\sum_{k=0}^{\infty} c_k^2 - \sum_{k=0}^{n} c_k^2 < \epsilon$ for $n$ sufficiently large so that 

$$\left| \int_{-1}^{1} f^2(x) \rho(x)\,dx - \sum_{k=0}^{\infty} c_k^2 \right| < 2\epsilon$$

Since $\int_{-1}^{1} f^2(x) \rho(x)\,dx$ and $\sum_{k=0}^{\infty} c_k^2$ are fixed constants whose difference can 
be made as small as we please, we must have 

$$\int_{-1}^{1} f^2(x) \rho(x)\,dx = \sum_{k=0}^{\infty} c_k^2$$

This completes the proof [see (6.43)]. As an immediate corollary it 
follows that for $f(x)$ continuous the relations $\int_{-1}^{1} x^n f(x) \rho(x)\,dx = 0$, 
$n = 0, 1, 2, \ldots$, imply $f(x) \equiv 0$. 

6.12. The Local Character of Convergence of the Fourier Series of $f(x)$. 
We are now in a position to examine the Fourier remainder $R_n(x)$ [see 
(6.36)] for the range $-1 \leq x \leq 1$. Let us assume that the $p_n(x)$ are 
uniformly bounded on this range. This means that $|p_n(x)| < M < \infty$, 
$-1 \leq x \leq 1$, for all $n$, $M$ a constant. Let $\delta$ be any small positive number. 
Then for $-1 \leq x_0 \leq 1$ 

$$R_n(x_0) = f(x_0) - s_n(x_0) = \frac{\gamma_{n+1}}{\gamma_n} \left[ \int_{-1}^{x_0 - \delta} \mu_n(x_0, t)\,dt + \int_{x_0 - \delta}^{x_0 + \delta} \mu_n(x_0, t)\,dt + \int_{x_0 + \delta}^{1} \mu_n(x_0, t)\,dt \right] \tag{6.46}$$

where 

$$\mu_n(x_0, t) = [p_{n+1}(x_0) p_n(t) - p_{n+1}(t) p_n(x_0)]\, \rho(t)\, \frac{f(x_0) - f(t)}{x_0 - t}$$

[see (6.37)]. For the intervals $-1 \leq t \leq x_0 - \delta$, $x_0 + \delta \leq t \leq 1$ the 
function $\dfrac{f(x_0) - f(t)}{x_0 - t}$ is well behaved. Hence 

$$\lim_{n \to \infty} \int_{-1}^{x_0 - \delta} \mu_n(x_0, t)\,dt = 0 \qquad \lim_{n \to \infty} \int_{x_0 + \delta}^{1} \mu_n(x_0, t)\,dt = 0$$

(see Prob. 15, Sec. 6.8). Hence 

$$\lim_{n \to \infty} R_n(x_0) = \lim_{n \to \infty} \frac{\gamma_{n+1}}{\gamma_n} \int_{x_0 - \delta}^{x_0 + \delta} [p_{n+1}(x_0) p_n(t) - p_n(x_0) p_{n+1}(t)]\, \rho(t)\, \frac{f(x_0) - f(t)}{x_0 - t}\,dt$$

We need not worry about $\lim \gamma_{n+1}/\gamma_n$ since $0 < \gamma_{n+1}/\gamma_n < 1$ (see Prob. 
18, Sec. 6.9). Whether $\lim R_n(x_0)$ is zero or not depends entirely on 
the behavior of $f(t)$ in the neighborhood of $t = x_0$. If, for example, 

$$\left| \frac{f(x_0) - f(t)}{x_0 - t} \right| < A < \infty$$

$A$ = constant, for all $t$ near $x_0$, then 
|Rn(ro)| S 6M?Abyo 


for $n$ sufficiently large. Since $\delta$ can be chosen arbitrarily small, it follows 
that $\lim_{n \to \infty} R_n(x_0) = 0$. 


For a complete discussion of orthogonal polynomials the reader is 
urged to read G. Szegő, "Orthogonal Polynomials," American Mathematical 
Society Colloquium Publications, vol. 23. 

6.13. Fourier Series of Trigonometric Functions. In the early part of 
the nineteenth century the French mathematician Fourier made a 
tremendous contribution to the field of mathematical physics. A study 
of heat motion (see Example 22, Chap. 5) led Fourier to the idea of 
expanding a real-valued function $f(x)$ in a trigonometric series, 

$$f(x) = \tfrac{1}{2} a_0 + a_1 \cos x + a_2 \cos 2x + \cdots + b_1 \sin x + b_2 \sin 2x + \cdots$$
$$= \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx \tag{6.47}$$

At the moment let us not concern ourselves with the validity of (6.47). 
We wish to determine $a_n$, $n = 0, 1, 2, \ldots$, and $b_n$, $n = 1, 2, \ldots$, if 
(6.47) were valid. From trigonometry we have 

$$\sin nx \sin mx = \tfrac{1}{2} [\cos (n - m)x - \cos (n + m)x]$$
$$\sin nx \cos mx = \tfrac{1}{2} [\sin (n - m)x + \sin (n + m)x]$$
$$\cos nx \cos mx = \tfrac{1}{2} [\cos (n - m)x + \cos (n + m)x]$$

so that 

$$\int_{-\pi}^{\pi} \sin nx \sin mx\,dx = \begin{cases} 0 & \text{if } n \neq m \\ \pi & \text{if } n = m \end{cases}$$
$$\int_{-\pi}^{\pi} \sin nx \cos mx\,dx = 0 \tag{6.48}$$
$$\int_{-\pi}^{\pi} \cos nx \cos mx\,dx = \begin{cases} 0 & \text{if } n \neq m \\ \pi & \text{if } n = m \end{cases}$$

provided $m$ and $n$ are positive integers. 

If we multiply (6.47) by $\cos mx$ and integrate term by term (we are not 
concerned at present with the validity of term-by-term integration), 
we obtain by use of (6.48) 

$$a_m = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos mx\,dx \qquad m = 0, 1, 2, \ldots$$

Similarly                                                              (6.49) 

$$b_m = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin mx\,dx \qquad m = 1, 2, 3, \ldots$$

The $a$'s and $b$'s of (6.49) are called the Fourier "coefficients" of $f(x)$. 
These coefficients can exist [if the integrals of (6.49) exist] regardless of the 
possibility of expanding $f(x)$ in the series given by (6.47). We can 
replace the interval $(-\pi, \pi)$ by the interval $(0, 2\pi)$ if we so desire. The 
results of (6.48) and (6.49) hold for the interval $(0, 2\pi)$. 

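Example 6.2. We shall see later that $f(x) = x$, for $-\pi \leq x \leq \pi$, has a Fourier-series 
development. We calculate the Fourier coefficients given by (6.49). 

$$a_m = \frac{1}{\pi} \int_{-\pi}^{\pi} x \cos mx\,dx = \frac{1}{\pi} \left( \left. \frac{x \sin mx}{m} \right|_{-\pi}^{\pi} - \frac{1}{m} \int_{-\pi}^{\pi} \sin mx\,dx \right)$$
$$= \frac{1}{\pi m^2} [\cos m\pi - \cos (-m\pi)] = 0 \qquad m \neq 0$$

$$a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} x\,dx = 0$$

$$b_m = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin mx\,dx = -\frac{2}{m} \cos m\pi = \begin{cases} -\dfrac{2}{m} & \text{for } m \text{ even} \\[4pt] \dfrac{2}{m} & \text{for } m \text{ odd} \end{cases}$$

Hence 

$$f(x) = 2(\sin x - \tfrac{1}{2} \sin 2x + \tfrac{1}{3} \sin 3x - \tfrac{1}{4} \sin 4x + \cdots) \tag{6.50}$$

Since the right-hand side of (6.50) is periodic, its graph is given by Fig. 6.1. 

Fig. 6.1 

We note that $f(\pi/2) = \pi/2 = 2(1 - \tfrac{1}{3} + \tfrac{1}{5} - \tfrac{1}{7} + \cdots)$. This checks the result obtained 
from 

$$\frac{\pi}{4} = \tan^{-1} 1 = \int_0^1 \frac{dx}{1 + x^2} = \int_0^1 (1 - x^2 + x^4 - x^6 + \cdots)\,dx = 1 - \tfrac{1}{3} + \tfrac{1}{5} - \tfrac{1}{7} + \cdots$$

provided the series can be integrated term by term. 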
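
The behavior of series (6.50) can be explored numerically. The following sketch assumes the coefficients b_m = 2(−1)^(m+1)/m found above; the function name and the numbers of terms are arbitrary illustrative choices, and convergence is slow (roughly like 1/n) because of the jump of the periodic extension at x = ±π.

```python
import math

def partial_sum(x, n):
    """n-term partial sum of series (6.50) for f(x) = x on (-pi, pi)."""
    return 2.0 * sum((-1) ** (m + 1) * math.sin(m * x) / m for m in range(1, n + 1))

inside = partial_sum(1.0, 20000)                 # should approach f(1) = 1
quarter = partial_sum(math.pi / 2.0, 20001)      # 2(1 - 1/3 + 1/5 - ...) -> pi/2
at_jump = partial_sum(math.pi, 200)              # every term vanishes at x = pi
```

At the end point x = π the sum stays at zero, the mean of the values −π and π approached by the periodic extension from either side.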
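Example 6.3. The function $f(x)$ given by 

$$f(x) = 0 \qquad -\pi \leq x \leq 0$$
$$f(x) = x \qquad 0 \leq x \leq \pi$$

will be seen to have a Fourier-series expansion. The Fourier coefficients of $f(x)$ are 

$$a_n = \frac{1}{\pi} \int_{-\pi}^{0} 0 \cdot \cos nx\,dx + \frac{1}{\pi} \int_0^{\pi} x \cos nx\,dx = \frac{1}{n^2 \pi} [(-1)^n - 1]$$

$$b_n = \frac{1}{\pi} \int_0^{\pi} x \sin nx\,dx = \frac{(-1)^{n+1}}{n}$$

$$a_0 = \frac{1}{\pi} \int_0^{\pi} x\,dx = \frac{\pi}{2}$$

Thus 

$$f(x) = \frac{\pi}{4} + \sum_{n=1}^{\infty} \left[ \frac{(-1)^n - 1}{n^2 \pi} \cos nx + \frac{(-1)^{n+1}}{n} \sin nx \right]$$
$$= \frac{\pi}{4} - \frac{2}{\pi} \sum_{n=1}^{\infty} \frac{\cos (2n - 1)x}{(2n - 1)^2} - \sum_{n=1}^{\infty} \frac{(-1)^n \sin nx}{n} \tag{6.51}$$

For $x = 0$ the right-hand side of (6.51) becomes $\dfrac{\pi}{4} - \dfrac{2}{\pi} \sum_{n=1}^{\infty} \dfrac{1}{(2n - 1)^2}$. 
Is $\sum_{n=1}^{\infty} \dfrac{1}{(2n - 1)^2} = \dfrac{\pi^2}{8}$? In complex-variable theory it can be shown that 

$$\frac{\tan z}{z} = 2 \sum_{n=0}^{\infty} \frac{1}{(n + \tfrac{1}{2})^2 \pi^2 - z^2}$$

If we allow $z$ to tend to zero, we obtain $\sum_{n=0}^{\infty} \dfrac{1}{(2n + 1)^2} = \dfrac{\pi^2}{8}$, the desired result. The 
limit process can be justified. 
For $x = -\pi$ and $x = \pi$ the right-hand side of (6.51) becomes 

$$\frac{\pi}{4} + \frac{2}{\pi} \sum_{n=1}^{\infty} \frac{1}{(2n - 1)^2} = \frac{\pi}{4} + \frac{\pi}{4} = \frac{\pi}{2}$$

Notice that $\dfrac{\pi}{2} = \dfrac{f(-\pi) + f(\pi)}{2}$. The right-hand side of (6.51) when evaluated 
at either end point $x = \pm\pi$ yields the mean of $f(x)$ at these end points. The 
graph of the right-hand side of (6.51) is given by Fig. 6.2. 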
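
The statements about (6.51) at x = 0 and at the end points can be checked by summing the series directly. This is a sketch under the assumption that the coefficients are a_0 = π/2, a_n = [(−1)^n − 1]/(n²π), and b_n = (−1)^(n+1)/n; the function name and the number of terms are illustrative.

```python
import math

def series_651(x, n_terms):
    """Partial sum of (6.51) with n_terms terms in each of the two sums."""
    total = math.pi / 4.0
    for n in range(1, n_terms + 1):
        total -= (2.0 / math.pi) * math.cos((2 * n - 1) * x) / (2 * n - 1) ** 2
        total -= (-1) ** n * math.sin(n * x) / n
    return total

value_at_zero = series_651(0.0, 20000)       # f(0) = 0
value_at_pi = series_651(math.pi, 20000)     # mean of 0 and pi, namely pi/2
value_inside = series_651(1.0, 20000)        # f(1) = 1
```

The value at x = π tends to π/2 rather than to f(π) = π, which is the mean-value behavior noted in the example.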





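Example 6.4. Let $f(z)$ be analytic for $|z| > 0$. Then 

$$f(z) = \sum_{n=-\infty}^{\infty} a_n z^n \qquad a_n = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)\,d\zeta}{\zeta^{n+1}} \tag{6.52}$$

where $C$ can be chosen as the circle $|\zeta| = 1$. On $C$ we have $\zeta = e^{i\varphi}$, $d\zeta = ie^{i\varphi}\,d\varphi$, so 
that 

$$f(z) = \sum_{n=-\infty}^{\infty} a_n z^n \qquad a_n = \frac{1}{2\pi} \int_0^{2\pi} f(e^{i\varphi}) e^{-in\varphi}\,d\varphi$$

For $z$ on the unit circle $|z| = 1$ we have $z = e^{i\theta}$ and 

$$f(e^{i\theta}) = \sum_{n=-\infty}^{\infty} a_n e^{in\theta} \qquad a_n = \frac{1}{2\pi} \int_0^{2\pi} f(e^{i\varphi}) e^{-in\varphi}\,d\varphi \tag{6.53}$$

If we define $g(\theta) = f(e^{i\theta})$, (6.53) becomes 

$$g(\theta) = \sum_{n=-\infty}^{\infty} a_n e^{in\theta} \qquad a_n = \frac{1}{2\pi} \int_0^{2\pi} g(\varphi) e^{-in\varphi}\,d\varphi \tag{6.54}$$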

Example 6.5. We can employ the results of Sec. 5.7 to evaluate the Fourier coefficients 
of $f(x)$. A particular solution of 

$$y'' + n^2 y = f(x) \tag{6.55}$$

is given by 

$$y(x) = \int_0^x f(t) g(x - t)\,dt$$

where $g(x)$ is a solution of $y'' + n^2 y = 0$, $g(0) = 0$, $g'(0) = 1$. Thus 

$$y(x) = \frac{1}{n} \int_0^x f(t) \sin n(x - t)\,dt \tag{6.56}$$

is a solution of (6.55) with $y(0) = 0$, $y'(0) = 0$. Evaluating (6.56) at $x = 2\pi$ yields 

$$y(2\pi) = -\frac{1}{n} \int_0^{2\pi} f(t) \sin nt\,dt$$

so that 

$$b_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \sin nt\,dt = -\frac{n\,y(2\pi)}{\pi} \tag{6.57}$$

An analogue computer can be used to solve (6.55) subject to the initial conditions 
$y(0) = 0$, $y'(0) = 0$. 

A graphical solution can be obtained for $y(x)$. The height of the ordinate at $x = 2\pi$ 
yields $y(2\pi)$, which in turn yields $b_n$ of (6.57). Differentiating (6.56) and evaluating 
at $x = 2\pi$ yields 

$$a_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \cos nt\,dt = \frac{y'(2\pi)}{\pi} \tag{6.58}$$
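
In place of an analogue computer, the initial-value problem (6.55) can be integrated digitally. The sketch below is one possible illustration, assuming a classical fourth-order Runge-Kutta scheme (not a method named in the text), the interval (0, 2π), and the test function f(t) = t, for which direct integration gives a_3 = 0 and b_3 = −2/3; the function name and step count are hypothetical.

```python
import math

def fourier_via_ode(f, n, steps=4000):
    """Integrate y'' + n^2 y = f(x), y(0) = y'(0) = 0, over (0, 2*pi) by RK4."""
    h = 2.0 * math.pi / steps
    x, y, v = 0.0, 0.0, 0.0                  # v stands for y'

    def acc(x, y):
        return f(x) - n * n * y              # y'' from (6.55)

    for _ in range(steps):
        k1y, k1v = v, acc(x, y)
        k2y, k2v = v + 0.5 * h * k1v, acc(x + 0.5 * h, y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, acc(x + 0.5 * h, y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, acc(x + h, y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        x += h
    b_n = -n * y / math.pi                   # relation (6.57)
    a_n = v / math.pi                        # relation (6.58)
    return a_n, b_n

a3, b3 = fourier_via_ode(lambda t: t, 3)
```

For this test function the exact solution is y = t/9 − sin(3t)/27, so y(2π) = 2π/9 and y'(2π) = 0, consistent with the values the sketch returns.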


Problems 


Find the Fourier series for the following functions defined in the interval 
$-\pi < x < \pi$: 

1. $f(x) = 0$ for $-\pi < x \leq 0$, $f(x) = 1$ for $0 < x < \pi$ 

2. $f(x) = |x|$ for $-\pi < x < \pi$ 

3. $f(x) = e^x$ for $-\pi < x < \pi$ 

4. $f(x) = \cos^2 x$ for $-\pi < x < \pi$ 

5. $f(x) = \cos a$ for $|x| \leq a \neq 0$, $f(x) = \cos x$ otherwise 

6. Let $f(x)$ be an even function, that is, $f(x) = f(-x)$. Show that 

$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\,dx = 0$$

7. Let $f(x)$ be an odd function, that is, $f(x) = -f(-x)$. Show that 

$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\,dx = 0$$

8. Show that any function can be written as the sum of an odd function and an even 
function. 
9. Find the Fourier series of $f(x) = |\sin x|$, $-\pi \leq x \leq \pi$. 


6.14. Convergence of the Trigonometric Fourier Series. We wish now 
to investigate the convergence to $f(x)$ of the Fourier series 

$$\tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) \tag{6.59}$$

$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos nt\,dt \qquad n = 0, 1, 2, \ldots$$
$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin nt\,dt$$

Some preliminary discussions are necessary. First let us consider 

$$\lim_{k \to \infty} \int_{-\pi}^{\pi} F(t) \sin kt\,dt \tag{6.60}$$

Intuitively we feel that for very large $k$ the function $F(x) \sin kx$ will be 
positive about as often as it is negative (with the same absolute value) 
since $\sin kx$ will oscillate very rapidly for large $k$. Since the integral 
$\int_{-\pi}^{\pi} F(t) \sin kt\,dt$ represents an area, we shall not be surprised if 

$$\lim_{k \to \infty} \int_{-\pi}^{\pi} F(t) \sin kt\,dt = 0 \tag{6.61}$$

Certainly much will depend on the nature of $F(x)$. It is easy to see that 
(6.61) holds true if $F(x)$ has a continuous first derivative for $-\pi \leq x \leq \pi$. 
Integration by parts yields 

$$\int_{-\pi}^{\pi} F(x) \sin kx\,dx = \left. -\frac{F(x) \cos kx}{k} \right|_{-\pi}^{\pi} + \frac{1}{k} \int_{-\pi}^{\pi} F'(x) \cos kx\,dx$$

From the fact that $F'(x)$ is bounded on the closed interval $-\pi \leq x \leq \pi$ 
it follows immediately that (6.61) holds true. 

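Actually we can weaken or make less stringent the conditions on $F(x)$ 
in order that (6.61) hold true. Nothing need be said about $F'(x)$. Let 
$F(x)$ and $[F(x)]^2$ be integrable on the interval $-\pi \leq x \leq \pi$. We now 
define $s_n(x)$ by 

$$s_n(x) = \tfrac{1}{2} a_0 + \sum_{k=1}^{n} a_k \cos kx + \sum_{k=1}^{n} b_k \sin kx \tag{6.62}$$

$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} F(t) \cos kt\,dt \qquad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} F(t) \sin kt\,dt$$

$s_n(x)$ is called the $n$th partial Fourier sum of $F(x)$. Note that $s_n(x)$ is a 
finite trigonometric series. Moreover $s_n(x)$ is continuous. From 

$$[F(x) - s_n(x)]^2 = [F(x)]^2 - 2F(x) s_n(x) + [s_n(x)]^2$$

we have 

$$\int_{-\pi}^{\pi} [F(t) - s_n(t)]^2\,dt = \int_{-\pi}^{\pi} [F(t)]^2\,dt - 2 \int_{-\pi}^{\pi} F(t) s_n(t)\,dt + \int_{-\pi}^{\pi} [s_n(t)]^2\,dt$$

One can easily show (see Sec. 6.8) that 

$$\int_{-\pi}^{\pi} F(t) s_n(t)\,dt = \frac{\pi a_0^2}{2} + \pi \sum_{k=1}^{n} (a_k^2 + b_k^2)$$
$$\int_{-\pi}^{\pi} [s_n(t)]^2\,dt = \frac{\pi a_0^2}{2} + \pi \sum_{k=1}^{n} (a_k^2 + b_k^2)$$

From the fact that $\int_{-\pi}^{\pi} [F(t) - s_n(t)]^2\,dt \geq 0$ we deduce that 

$$\frac{a_0^2}{2} + \sum_{k=1}^{n} (a_k^2 + b_k^2) \leq \frac{1}{\pi} \int_{-\pi}^{\pi} [F(t)]^2\,dt \tag{6.63}$$

The left-hand side is monotonic increasing with $n$ and is bounded by 
the constant $\frac{1}{\pi} \int_{-\pi}^{\pi} [F(t)]^2\,dt$. Hence 

$$\frac{a_0^2}{2} + \sum_{k=1}^{\infty} (a_k^2 + b_k^2) \leq \frac{1}{\pi} \int_{-\pi}^{\pi} [F(t)]^2\,dt \tag{6.64}$$

This is Bessel's inequality [see (6.41)]. Since the series of (6.64) converges, 
of necessity, 

$$\lim_{k \to \infty} a_k = 0 \qquad \lim_{k \to \infty} b_k = 0$$

$\lim_{k \to \infty} b_k = 0$ is precisely the statement (6.61). 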
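
Inequality (6.63) can be illustrated with the function F(x) = x of Example 6.2. This sketch assumes the coefficients a_k = 0 and b_k = 2(−1)^(k+1)/k from that example, so the left side of (6.63) reduces to a sum of 4/k²; the function name is illustrative.

```python
import math

def bessel_partial(n):
    """Left side of (6.63) for F(x) = x, where a_k = 0 and b_k^2 = 4 / k^2."""
    return sum((2.0 / k) ** 2 for k in range(1, n + 1))

# Right side of (6.63): (1/pi) times the integral of t^2 over (-pi, pi).
bound = (1.0 / math.pi) * (2.0 * math.pi ** 3 / 3.0)
sums = [bessel_partial(n) for n in (1, 10, 100, 10000)]
```

The partial sums increase with n and stay below the bound 2π²/3; for this particular function they approach it, so equality holds in (6.64).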


A simple class of functions which are both integrable and integrable 
square is the set of functions which have a finite number of bounded 
discontinuities in the sense that if $x = c$ is a point of discontinuity of such a 
function then $\lim_{x \to c,\, x < c} f(x)$ and $\lim_{x \to c,\, x > c} f(x)$ both exist but are not equal. We 
write 

$$\lim_{\substack{x \to c \\ x < c}} f(x) = f(c - 0) \qquad \lim_{\substack{x \to c \\ x > c}} f(x) = f(c + 0)$$

A function which has a finite number of bounded discontinuities of the 
type described above is said to be sectionally, or piecewise, continuous. 
Now let $f(x)$ be sectionally continuous for $-\pi \leq x \leq \pi$, and assume that 
$f(x)$ has the period $2\pi$, that is, $f(x + 2\pi) = f(x)$. We make use of the 
periodicity of $f(x)$ in (6.67) below. It is not necessary, however, that 
$f(\pi) = f(-\pi)$, since the value of $f(x)$ at one point does not affect the value 
of the integral [see (6.67)]. In most cases $f(x)$ is defined only for the 
interval $-\pi \leq x \leq \pi$. We easily make $f(x)$ periodic by defining $f(x)$ 
at other values of $x$ by the condition $f(x + 2\pi) = f(x)$. 

The $n$th Fourier partial sum of $f(x)$ can be written as 

$$s_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\,dt + \frac{1}{\pi} \sum_{k=1}^{n} \int_{-\pi}^{\pi} f(t) \cos kt \cos kx\,dt + \frac{1}{\pi} \sum_{k=1}^{n} \int_{-\pi}^{\pi} f(t) \sin kt \sin kx\,dt$$
$$= \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \left[ \frac{1}{2} + \sum_{k=1}^{n} \cos k(t - x) \right] dt$$

It is important to note that $s_n(x)$ is a finite trigonometric series and that 
$s_n(x)$ is continuous. As long as the Fourier coefficients exist, it is always 
possible to construct $s_n(x)$. Thus $s_n(x)$ exists independent of the possible 
development of $f(x)$ as a Fourier series. 

We desire to show that, under suitable restrictions concerning the first 
derivative of $f(x)$, 

$$\lim_{n \to \infty} s_n(x) = \tfrac{1}{2} [f(x + 0) + f(x - 0)] \tag{6.65}$$

From 

$$e^{ki(t - x)} = \cos k(t - x) + i \sin k(t - x)$$
$$e^{-ki(t - x)} = \cos k(t - x) - i \sin k(t - x)$$

the reader can show that 

$$\frac{1}{2} + \sum_{k=1}^{n} \cos k(t - x) = \frac{\sin (n + \tfrac{1}{2})(t - x)}{2 \sin \tfrac{1}{2}(t - x)} \tag{6.66}$$

Thus 

$$s_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\, \frac{\sin (n + \tfrac{1}{2})(t - x)}{2 \sin \tfrac{1}{2}(t - x)}\,dt = \frac{1}{\pi} \int_{-\pi - x}^{\pi - x} f(x + u)\, \frac{\sin (n + \tfrac{1}{2})u}{2 \sin (u/2)}\,du$$
$$= \frac{1}{\pi} \int_{-\pi}^{\pi} f(x + u)\, \frac{\sin (n + \tfrac{1}{2})u}{2 \sin (u/2)}\,du \tag{6.67}$$

The last integral results from the fact that the integrand has period $2\pi$. 
From (6.66) the reader can deduce that 


_ 1 [” sin (n+ 49)u 
=e] Sante 


T 
sin (n + $)u q 


so that f(x) = . [- f(x) 2 sin (u/2) U 


sin (n + 3)u J 


and 3,(z) — f(z) =i f" [f(z + u) ~ S@)) Fsin (a2) Uu 


ee . f fete-fe pan a) sin (n + 3)u du 


Now consider $x$ fixed, and define

$$F(u) = \frac{f(x + u) - f(x)}{u}\, \frac{u}{2 \sin (u/2)}$$

$F(u)$ will be sectionally continuous if $f'(x)$ exists, for in this case

$$\lim_{u \to 0} F(u) = f'(x)$$

Hence $\lim_{n \to \infty} \int_{-\pi}^{\pi} F(u) \sin (n + \tfrac{1}{2})u \, du = 0$ (see Prob. 2 of this section).




We have shown that, if $f(x)$ is sectionally continuous for $-\pi \le x \le \pi$ and if $f(x)$ has period $2\pi$, then, at any point $x$ such that $f'(x)$ exists, the Fourier series of $f(x)$ given by (6.59) converges to $f(x)$.

It may be that $f'(x)$ does not exist at the point $x$ but that

$$\lim_{u \to 0,\; u > 0} \frac{f(x + u) - f(x + 0)}{u} = f'(x + 0)$$

$$\lim_{u \to 0,\; u < 0} \frac{f(x + u) - f(x - 0)}{u} = f'(x - 0)$$

do exist separately. We call $f'(x + 0)$ and $f'(x - 0)$ the right- and left-hand derivatives of $f(x)$, respectively. Now let the reader show that

$$s_n(x) - \tfrac{1}{2}[f(x + 0) + f(x - 0)] = \frac{1}{\pi} \int_{0}^{\pi} \frac{f(x + u) - f(x + 0)}{u}\, \frac{u}{2 \sin (u/2)}\, \sin (n + \tfrac{1}{2})u \, du$$

$$\qquad + \frac{1}{\pi} \int_{-\pi}^{0} \frac{f(x + u) - f(x - 0)}{u}\, \frac{u}{2 \sin (u/2)}\, \sin (n + \tfrac{1}{2})u \, du$$

It follows that if $f'(x + 0)$ and $f'(x - 0)$ exist then

$$\lim_{n \to \infty} s_n(x) = \tfrac{1}{2}[f(x + 0) + f(x - 0)] \tag{6.68}$$


We formulate the above results in the following statement: 

THEOREM 6.1. Let $f(x)$ be sectionally continuous for $-\pi \le x \le \pi$, and let $f(x)$ be periodic with period $2\pi$. At any point $x$ such that $f(x)$ has both a right- and left-hand derivative the Fourier series of $f(x)$ will converge to the mean value of $f(x)$ at $x$ defined by $\tfrac{1}{2}[f(x + 0) + f(x - 0)]$.
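The statement of the theorem can be checked numerically. The sketch below is only an illustration, not part of the text: it assumes the test function f(x) = exp(x) on (-π, π), extended with period 2π, whose Fourier coefficients a_k = 2 sinh π (-1)^k / [π(1 + k²)] and b_k = -k a_k follow from two integrations by parts. The periodic extension jumps at x = π, where the mean value is cosh π; at the interior point x = 1 the series should approach e. The function name `partial_sum` is hypothetical.

```python
import math

def partial_sum(x, n):
    """n-th Fourier partial sum of f(x) = exp(x) on (-pi, pi), period 2*pi.

    Assumed closed-form coefficients (obtained by integration by parts):
        a_k = c * (-1)**k / (1 + k**2),  b_k = -k * a_k,  c = 2*sinh(pi)/pi.
    """
    c = 2.0 * math.sinh(math.pi) / math.pi
    total = 0.5 * c  # the term a_0 / 2
    for k in range(1, n + 1):
        sign = -1.0 if k % 2 else 1.0
        a_k = c * sign / (1 + k * k)
        b_k = -k * a_k
        total += a_k * math.cos(k * x) + b_k * math.sin(k * x)
    return total

# At the jump x = pi the partial sums approach the mean value cosh(pi).
print(partial_sum(math.pi, 2000), math.cosh(math.pi))
# At the interior point x = 1, where f'(x) exists, they approach f(1) = e.
print(partial_sum(1.0, 2000), math.e)
```

The convergence is slow (the error decays roughly like 1/n), which is typical when the periodic extension has a jump.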

It should be emphasized that a sufficient condition for f(x) to be 
written as a Fourier series has been given. There are no known necessary 
and sufficient conditions for f(x) to be developed in terms of a Fourier 
series. Study of the Lebesgue integral yields greater insight into the 
development of Fourier series. References to this line of study are given 
at the end of this chapter. 


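Problems

1. From the identity

$$2 \sin \frac{u}{2} \cos ku = \sin (k + \tfrac{1}{2})u - \sin (k - \tfrac{1}{2})u$$

deduce that

$$\frac{1}{2} + \sum_{k=1}^{n} \cos ku = \frac{\sin (n + \frac{1}{2})u}{2 \sin (u/2)}$$

Hint: Let $k = 1, 2, \ldots, n$, and add.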


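2. Under the restriction that $F(x)$ and $[F(x)]^2$ be integrable we have shown that

$$\lim_{n \to \infty} \int_{-\pi}^{\pi} F(t) \sin nt \, dt = 0 \qquad \lim_{n \to \infty} \int_{-\pi}^{\pi} F(t) \cos nt \, dt = 0$$

For the same $F(x)$ why is it true that $\lim_{n \to \infty} \int_{-\pi}^{\pi} F(t) \cos (t/2) \sin nt \, dt = 0$? Show that

$$\lim_{n \to \infty} \int_{-\pi}^{\pi} F(t) \sin (n + \tfrac{1}{2})t \, dt = 0$$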

3. Let $f(x)$ have a continuous derivative for $-\pi \le x \le \pi$, so that $|f'(x)| < M$ for $-\pi \le x \le \pi$. Show that the Fourier coefficients of $f(x)$ satisfy

$$|a_k| < \frac{2M}{k} \qquad k = 1, 2, 3, \ldots$$

What further restriction can be imposed upon $f(x)$ so that $|b_k| < 2M/k$?

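4. Let $f(x)$ and $[f(x)]^2$ be integrable for the range $-\pi \le x \le \pi$. Consider the finite trigonometric series

$$S_n(x) = \tfrac{1}{2} a_0 + \sum_{k=1}^{n} a_k \cos kx + \sum_{k=1}^{n} b_k \sin kx$$

Show that $J = \int_{-\pi}^{\pi} [f(x) - S_n(x)]^2 \, dx$ is a minimum for

$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx \, dx \qquad k = 0, 1, 2, \ldots$$

$$b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx \, dx \qquad k = 1, 2, 3, \ldots$$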

5. Let $f(x) = \sin (1/x)$, $x \ne 0$, $f(0) = 1$. Is $f(x)$ sectionally continuous?

6. If $f'(x)$ is sectionally continuous for $-\pi \le x \le \pi$, show that $f(x)$ is continuous for $-\pi \le x \le \pi$.

7. If $f(x)$ is a continuous function such that $f(x + \pi/3) = f(x)$, show that its Fourier series has the form

$$\frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos 6nx + \sum_{n=1}^{\infty} b_n \sin 6nx$$


8. Let $\{a_n, b_n\}$ be the Fourier coefficients of $f(x)$, $\{\alpha_n, \beta_n\}$ the Fourier coefficients of $g(x)$. Find the Fourier coefficients of

$$h(x) = \int_{-\pi}^{\pi} f(x - t)\, g(t)\, dt$$


9. Let $f(x)$ be continuous on the closed interval $a \le x \le b$. Subdivide this interval into $a = x_0, x_1, x_2, \ldots, x_n = b$. Now write

$$\int_a^b f(x) \sin \alpha x \, dx = \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} f(x) \sin \alpha x \, dx$$

$$= \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} [f(x) - f(x_k)] \sin \alpha x \, dx + \sum_{k=1}^{n} f(x_k) \int_{x_{k-1}}^{x_k} \sin \alpha x \, dx$$

From the fact that $f(x)$ is uniformly continuous deduce that

$$\lim_{\alpha \to \infty} \int_a^b f(x) \sin \alpha x \, dx = 0$$

Extend this result for $f(x)$ sectionally continuous on $a \le x \le b$.


6.15. Differentiation and Integration of Trigonometric Fourier Series. We have seen that under suitable restrictions the trigonometric Fourier series of $f(x)$ will converge to $f(x)$. Now we are concerned with term-by-term differentiation of a Fourier series. Let us return to the Taylor-series expansion of $f(z)$. If $\sum_{n=0}^{\infty} a_n z^n$ converges to $f(z)$ for $0 \le |z| < r$, we know that $\sum_{n=1}^{\infty} n a_n z^{n-1}$ will converge to $f'(z)$ for $0 \le |z| < r$. In the case of Fourier series, however, a simple example illustrates that the new Fourier series obtained by term-by-term differentiation may not converge to $f'(x)$ even though $f'(x)$ exists. The Fourier series of $f(x) = x$, $-\pi < x < \pi$, is given by

$$2(\sin x - \tfrac{1}{2} \sin 2x + \tfrac{1}{3} \sin 3x - \tfrac{1}{4} \sin 4x + \cdots)$$

Term-by-term differentiation yields

$$2(\cos x - \cos 2x + \cos 3x - \cos 4x + \cdots)$$

This series cannot converge since $\cos nx$ does not tend to zero as $n$ becomes infinite.
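A short numerical experiment makes the contrast concrete. The sketch below is an illustration under the assumption that x = 1 is a representative interior point; the function names are hypothetical. The partial sums of the series for f(x) = x settle near 1, while consecutive partial sums of the differentiated series keep differing by 2|cos nx|, which does not shrink.

```python
import math

def series_x(x, n):
    """Partial sum of 2(sin x - sin 2x / 2 + sin 3x / 3 - ...), the series of f(x) = x."""
    return 2.0 * sum((-1) ** (k + 1) * math.sin(k * x) / k for k in range(1, n + 1))

def differentiated(x, n):
    """Partial sum of the term-by-term derivative 2(cos x - cos 2x + cos 3x - ...)."""
    return 2.0 * sum((-1) ** (k + 1) * math.cos(k * x) for k in range(1, n + 1))

x = 1.0
for n in (1000, 1001, 1002, 1003):
    # First column converges toward x; second column keeps oscillating.
    print(n, round(series_x(x, n), 4), round(differentiated(x, n), 4))
```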

Certainly term-by-term differentiation of a Fourier series leads to a new Fourier series. If this new series converges to $f'(x)$, we shall wish to make certain that $f'(x)$ has a convergent Fourier series. From Sec. 6.14 we have seen that a sufficient condition for this is that $f'(x)$ be sectionally continuous and that $f'(x)$ have periodicity $2\pi$. It is not necessary that $f'(-\pi) = f'(\pi)$. Let the reader show that, if $f'(x)$ is sectionally continuous, then $f(x)$ is continuous. Furthermore if $f'(x)$ has periodicity $2\pi$, the reader can show that $f(-\pi) = f(\pi)$ is sufficient to guarantee that $f(x)$ has periodicity $2\pi$. We state and prove now the following theorem:

THEOREM 6.2. Let $f(x)$ be continuous in the interval $-\pi \le x \le \pi$, $f(-\pi) = f(\pi)$, and let $f'(x)$ be sectionally continuous with periodicity $2\pi$. At each point $x$ for which $f''(x)$ exists the Fourier series of $f(x)$ can be differentiated term by term, and the resulting Fourier series will converge to $f'(x)$.


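The proof is as follows: From Sec. 6.14, $f'(x)$ can be written

$$f'(x) = \tfrac{1}{2} a_0' + \sum_{n=1}^{\infty} a_n' \cos nx + \sum_{n=1}^{\infty} b_n' \sin nx$$

with

$$a_n' = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x) \cos nx \, dx \qquad n = 0, 1, 2, \ldots$$

$$b_n' = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x) \sin nx \, dx \qquad n = 1, 2, 3, \ldots$$

Integration by parts yields

$$a_n' = \frac{1}{\pi} \Big[ f(x) \cos nx \Big]_{-\pi}^{\pi} + \frac{n}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \, dx = \frac{n}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \, dx = n b_n$$

since $f(-\pi) = f(\pi)$ by assumption; in particular $a_0' = 0$. Here $b_n$ is the nth Fourier sine coefficient of $f(x)$. Similarly

$$b_n' = -n a_n$$

so that

$$f'(x) = \sum_{n=1}^{\infty} -n a_n \sin nx + \sum_{n=1}^{\infty} n b_n \cos nx \tag{6.69}$$

The Fourier series of (6.69), however, is the Fourier series obtained by term-by-term differentiation of

$$f(x) = \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx$$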


This proves the theorem stated above. 

We turn now to the question of term-by-term integration of a Fourier series. The reader knows from real-variable theory that integration tends to smooth out discontinuities, whereas differentiation tends to introduce discontinuities. As an example, consider the function $f(x)$ defined as follows:

$$f(x) = 0 \quad \text{for } -\infty < x < 0$$
$$f(x) = x \quad \text{for } 0 \le x \le 1$$
$$f(x) = 2 \quad \text{for } 1 < x < \infty$$

$f'(x)$ does not exist at $x = 0$ and at $x = 1$. Moreover $f(x)$ is discontinuous at $x = 1$. However, the Riemann integral of $f(x)$ defined by

$$F(x) = \int_{-\infty}^{x} f(t) \, dt$$

is continuous for $-\infty < x < \infty$. At any point $x \ne 1$ [$x = 1$ is the only discontinuity of $f(x)$] we have $F'(x) = f(x)$. Thus $F'(0)$ exists, whereas $f'(0)$ does not exist. Let the reader show that $F'(1)$ does not exist.




We find that very little is required to integrate a Fourier series term by term. We state and prove the following theorem:

THEOREM 6.3. Let $f(x)$ be sectionally continuous for $-\pi \le x \le \pi$. If $\tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx$ is the Fourier series of $f(x)$, then

$$\int_{-\pi}^{x} f(x) \, dx = \frac{a_0}{2} (x + \pi) + \sum_{n=1}^{\infty} \frac{a_n}{n} \sin nx - \sum_{n=1}^{\infty} \frac{b_n}{n} (\cos nx - \cos n\pi) \tag{6.70}$$

Equation (6.70) holds whether the Fourier series corresponding to $f(x)$ converges or not. Equation (6.70) is not a Fourier series unless $a_0 = 0$.

The proof is as follows: Since $f(x)$ is sectionally continuous and hence Riemann-integrable, we note that $F(x)$ defined by

$$F(x) = \int_{-\pi}^{x} f(x) \, dx - \tfrac{1}{2} a_0 x \tag{6.71}$$

is continuous. Moreover $F'(x) = f(x) - \tfrac{1}{2} a_0$ except at points of discontinuity of $f(x)$. If $x = c$ is a point of discontinuity of $f(x)$, the reader can easily verify that $F'(c - 0)$ and $F'(c + 0)$ exist. From (6.71) it follows that $F(-\pi) = F(\pi) = \tfrac{1}{2} a_0 \pi$. $F(x)$ can be made periodic by defining $F(x)$ for $|x| > \pi$ by the equation $F(x + 2\pi) = F(x)$. From Sec. 6.14, $F(x)$ has a Fourier-series development

$$\tfrac{1}{2}[F(x + 0) + F(x - 0)] = \tfrac{1}{2} A_0 + \sum_{n=1}^{\infty} A_n \cos nx + \sum_{n=1}^{\infty} B_n \sin nx$$

or

$$F(x) = \tfrac{1}{2} A_0 + \sum_{n=1}^{\infty} A_n \cos nx + \sum_{n=1}^{\infty} B_n \sin nx \tag{6.72}$$

since $F(x + 0) = F(x - 0)$ from the continuity of $F(x)$. Integration by parts yields

$$A_n = \frac{1}{\pi} \int_{-\pi}^{\pi} F(x) \cos nx \, dx = \frac{1}{n\pi} \Big[ F(x) \sin nx \Big]_{-\pi}^{\pi} - \frac{1}{n\pi} \int_{-\pi}^{\pi} F'(x) \sin nx \, dx$$

$$= -\frac{1}{n\pi} \int_{-\pi}^{\pi} [f(x) - \tfrac{1}{2} a_0] \sin nx \, dx = -\frac{1}{n\pi} \int_{-\pi}^{\pi} f(x) \sin nx \, dx = -\frac{b_n}{n} \qquad n \ne 0$$

Similarly $B_n = (1/n) a_n$. We now write

$$F(x) = \tfrac{1}{2} A_0 - \sum_{n=1}^{\infty} \frac{b_n}{n} \cos nx + \sum_{n=1}^{\infty} \frac{a_n}{n} \sin nx$$

From $F(\pi) = \tfrac{1}{2} a_0 \pi$ we deduce that $\tfrac{1}{2} A_0 = \tfrac{1}{2} a_0 \pi + \sum_{n=1}^{\infty} \frac{b_n}{n} \cos n\pi$. Using these results along with (6.71) yields (6.70). Q.E.D.
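Equation (6.70) can be tested on the series of f(x) = x quoted earlier in this section, for which a_n = 0 and b_n = 2(-1)^(n+1)/n. The left-hand side is then (x² - π²)/2. The sketch below is only a numerical check under those assumptions; the function names are hypothetical.

```python
import math

def integrated_series(x, n):
    """Right-hand side of (6.70) for f(x) = x: a_k = 0, b_k = 2(-1)^(k+1)/k."""
    total = 0.0
    for k in range(1, n + 1):
        b_k = 2.0 * (-1) ** (k + 1) / k
        total -= (b_k / k) * (math.cos(k * x) - math.cos(k * math.pi))
    return total

def exact_integral(x):
    """Integral of t dt from -pi to x."""
    return 0.5 * (x * x - math.pi ** 2)

for x in (-2.0, 0.0, 1.0, 3.0):
    print(x, integrated_series(x, 2000), exact_integral(x))
```

The integrated series has terms of order 1/n², so it converges absolutely, in contrast with the differentiated series.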


Problems 


1. Differentiate and integrate the Fourier series for those functions of Probs. 1, 2, 3, 4, 5, 9, Sec. 6.13, which admit these processes.
2. Sum the series

$$\frac{1}{1^3} - \frac{1}{3^3} + \frac{1}{5^3} - \frac{1}{7^3} + \cdots + \frac{(-1)^n}{(2n + 1)^3} + \cdots$$


3. It can be shown that if $f(x)$ is continuous for $-\pi \le x \le \pi$, $f(\pi) = f(-\pi)$, and if $f'(x)$ is sectionally continuous for $-\pi \le x \le \pi$, then the Fourier series of $f(x)$ converges uniformly and absolutely for $-\pi \le x \le \pi$. Multiply

$$f(x) = \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx$$

by $f(x)$, integrate term by term (this is permissible since the new series converges uniformly), and show that

$$\int_{-\pi}^{\pi} [f(x)]^2 \, dx = \pi \left[ \tfrac{1}{2} a_0^2 + \sum_{n=1}^{\infty} (a_n^2 + b_n^2) \right]$$

This identity is due to Parseval.


6.16. The Fourier Integral. If $f(x)$ is defined for $x$ on the range $-\infty < x < \infty$ and if $f(x)$ is not periodic, it is obvious that we cannot represent $f(x)$ by a trigonometric Fourier series since such series, of necessity, must be periodic. We shall show, however, that under certain conditions it will be possible to obtain a Fourier integral representation of $f(x)$.

THEOREM 6.4. Let $f(x)$ be sectionally continuous in every finite interval $(a, b)$, and let $f(x)$ be absolutely integrable for the infinite range $-\infty < x < \infty$, that is, $\int_{-\infty}^{\infty} |f(x)| \, dx = A < \infty$. At every point $x$, $-\infty < x < \infty$, such that $f(x)$ has a right- and left-hand derivative, $f(x)$ is represented by its Fourier integral as follows:

$$\tfrac{1}{2}[f(x + 0) + f(x - 0)] = \frac{1}{\pi} \int_{0}^{\infty} d\alpha \int_{-\infty}^{\infty} f(t) \cos \alpha(t - x) \, dt \tag{6.73}$$




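We proceed to the proof in the following manner: The result of Prob. 9, Sec. 6.14, as well as the methods of that section, enables us to state that

$$\tfrac{1}{2}[f(x + 0) + f(x - 0)] = \lim_{\alpha \to \infty} \frac{1}{\pi} \int_a^b f(t)\, \frac{\sin \alpha(t - x)}{t - x}\, dt \tag{6.74}$$

at any point $x$ for which $f(x)$ has a right- and left-hand derivative, $a < x < b$. Now we investigate the existence of

$$\frac{1}{\pi} \int_{-\infty}^{\infty} f(t)\, \frac{\sin \alpha(t - x)}{t - x}\, dt \tag{6.75}$$

For $a < x$ we observe that

$$\left| \int_{-\infty}^{a} f(t)\, \frac{\sin \alpha(t - x)}{t - x}\, dt \right| \le \frac{1}{|a - x|} \int_{-\infty}^{a} |f(t)| \, dt$$

The convergence of $\int_{-\infty}^{\infty} |f(t)| \, dt$ shows that for any $\epsilon/2 > 0$ an $A$ exists such that for $a \le A$

$$\left| \int_{-\infty}^{a} f(t)\, \frac{\sin \alpha(t - x)}{t - x}\, dt \right| < \frac{\epsilon}{2}$$

independent of $\alpha$. Similarly one shows that for any $\epsilon/2 > 0$ a number $B$ exists such that for $b \ge B$

$$\left| \int_{b}^{\infty} f(t)\, \frac{\sin \alpha(t - x)}{t - x}\, dt \right| < \frac{\epsilon}{2}$$

independent of $\alpha$. Thus the right-hand side of (6.74) can be made to differ from

$$\lim_{\alpha \to \infty} \frac{1}{\pi} \int_{-\infty}^{\infty} f(t)\, \frac{\sin \alpha(t - x)}{t - x}\, dt$$

by any arbitrary $\epsilon > 0$ provided $-a$ and $b$ are chosen sufficiently large. This implies immediately that

$$\tfrac{1}{2}[f(x + 0) + f(x - 0)] = \lim_{\alpha \to \infty} \frac{1}{\pi} \int_{-\infty}^{\infty} f(t)\, \frac{\sin \alpha(t - x)}{t - x}\, dt \tag{6.76}$$

Now $\int_0^{\alpha} \cos \mu(t - x) \, d\mu = \dfrac{\sin \alpha(t - x)}{t - x}$, so that (6.76) may be written

$$\tfrac{1}{2}[f(x + 0) + f(x - 0)] = \lim_{\alpha \to \infty} \frac{1}{\pi} \int_{-\infty}^{\infty} f(t) \, dt \int_0^{\alpha} \cos \mu(t - x) \, d\mu \tag{6.77}$$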


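Before we pass to the limit as $\alpha \to \infty$, we wonder if we can interchange the order of the double integration. Let us first consider

$$G(\mu) = \int_{-\infty}^{\infty} f(t) \cos \mu(t - x) \, dt$$

$x$ a parameter and $0 \le \mu \le \alpha$, $\alpha$ fixed. We show that $G(\mu)$ is continuous. For $\mu = \mu_0$ we have

$$G(\mu) - G(\mu_0) = \int_{-\infty}^{-T} f(t)[\cos \mu(t - x) - \cos \mu_0(t - x)] \, dt + \int_{-T}^{T} f(t)[\cos \mu(t - x) - \cos \mu_0(t - x)] \, dt$$

$$\qquad + \int_{T}^{\infty} f(t)[\cos \mu(t - x) - \cos \mu_0(t - x)] \, dt$$

The first integral is less than $2 \int_{-\infty}^{-T} |f(t)| \, dt$, and the last integral is less than $2 \int_{T}^{\infty} |f(t)| \, dt$. Since we are given that $\int_{-\infty}^{\infty} |f(t)| \, dt$ exists, a $T$ can be found such that these integrals can each be made less than $\epsilon/3$ for any $\epsilon > 0$. $T$ is obtained after $\epsilon$ is chosen. From the mean-value theorem,

$$\cos \mu(t - x) - \cos \mu_0(t - x) = -(\mu - \mu_0)(t - x) \sin \bar{\mu}(t - x)$$

$\bar{\mu}$ between $\mu_0$ and $\mu$, we note that the middle integral is less than

$$|\mu - \mu_0| \, (T + |x|) \int_{-T}^{T} |f(t)| \, dt$$

Since $\int_{-\infty}^{\infty} |f(t)| \, dt = A$ is a finite number, we can find a $\delta > 0$ so that the middle integral is also less than $\epsilon/3$ for $|\mu - \mu_0| < \delta < \epsilon / [3A(T + |x|)]$. Hence for any $\epsilon > 0$ there exists a $\delta > 0$ such that

$$|G(\mu) - G(\mu_0)| < \epsilon$$

for $|\mu - \mu_0| < \delta$. This is the definition of continuity, so that $G(\mu)$ is continuous at $\mu = \mu_0$ and, in fact, for all $\mu$. Hence

$$H(\alpha) = \int_0^{\alpha} d\mu \int_{-\infty}^{\infty} f(t) \cos \mu(t - x) \, dt$$

exists, and $\dfrac{dH}{d\alpha} = \int_{-\infty}^{\infty} f(t) \cos \alpha(t - x) \, dt$. By considering

$$K(\alpha) = \int_{-\infty}^{\infty} dt \int_0^{\alpha} f(t) \cos \mu(t - x) \, d\mu$$

let the reader show that $\dfrac{dK}{d\alpha} = \dfrac{dH}{d\alpha}$. Since $K(0) = H(0)$, of necessity

$$\int_{-\infty}^{\infty} dt \int_0^{\alpha} f(t) \cos \mu(t - x) \, d\mu = \int_0^{\alpha} d\mu \int_{-\infty}^{\infty} f(t) \cos \mu(t - x) \, dt$$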




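Allowing $\alpha$ to tend to infinity enables us to rewrite (6.77) in the form given by (6.73).

If $f(x)$ is continuous at $x$, (6.73) may be written

$$f(x) = \frac{1}{\pi} \int_0^{\infty} d\alpha \int_{-\infty}^{\infty} f(t) \cos \alpha(t - x) \, dt \tag{6.78}$$

From

$$\cos \alpha(t - x) = \tfrac{1}{2} \left( e^{i\alpha(t - x)} + e^{-i\alpha(t - x)} \right)$$

one readily obtains

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\alpha \int_{-\infty}^{\infty} f(t)\, e^{-i\alpha(t - x)} \, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\alpha x} \, d\alpha \int_{-\infty}^{\infty} f(t)\, e^{-i\alpha t} \, dt \tag{6.79}$$

Equation (6.79) can be written in the symmetric form

$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{i\alpha x} g(\alpha) \, d\alpha \qquad g(\alpha) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-i\alpha t} f(t) \, dt \tag{6.80}$$

We say that $g(\alpha)$ is the Fourier transform of the real function $f(x)$, and $f(x)$ is called the inverse Fourier transform of $g(\alpha)$.

A more rigorous treatment of the Fourier integral requires the use of the Lebesgue integral.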


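Example 6.6. We consider the function $f(x)$ defined as follows: $f(x) = 0$ for $x < 0$ and $f(x) = e^{-\beta x}$ for $x \ge 0$, $\beta > 0$. Fourier's integral certainly applies to this function, for $f(x)$ has a simple discontinuity at $x = 0$, and $\int_{-\infty}^{\infty} |f(x)| \, dx$ exists. From (6.79)

$$g(\alpha) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} e^{-i\alpha t} e^{-\beta t} \, dt = \frac{1}{\sqrt{2\pi}}\, \frac{1}{\beta + i\alpha}$$

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{i\alpha x}}{\beta + i\alpha} \, d\alpha = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{(\beta - i\alpha)\, e^{i\alpha x}}{\beta^2 + \alpha^2} \, d\alpha$$

$$= \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\beta \cos \alpha x + \alpha \sin \alpha x}{\beta^2 + \alpha^2} \, d\alpha = \frac{1}{\pi} \int_0^{\infty} \frac{\beta \cos \alpha x + \alpha \sin \alpha x}{\beta^2 + \alpha^2} \, d\alpha$$

From the definition of $f(x)$ one has for $x > 0$

$$e^{-\beta x} = \frac{1}{\pi} \int_0^{\infty} \frac{\beta \cos \alpha x + \alpha \sin \alpha x}{\beta^2 + \alpha^2} \, d\alpha$$

$$0 = \frac{1}{\pi} \int_0^{\infty} \frac{\beta \cos \alpha x - \alpha \sin \alpha x}{\beta^2 + \alpha^2} \, d\alpha$$

so that

$$\frac{\pi}{2\beta}\, e^{-\beta x} = \int_0^{\infty} \frac{\cos \alpha x}{\beta^2 + \alpha^2} \, d\alpha \qquad \frac{\pi}{2}\, e^{-\beta x} = \int_0^{\infty} \frac{\alpha \sin \alpha x}{\beta^2 + \alpha^2} \, d\alpha \tag{6.81}$$

Let the reader obtain (6.81) directly by contour integration.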
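The transform of this example, g(α) = 1/[√(2π)(β + iα)], can be compared with a direct numerical evaluation of the defining integral in (6.80). The sketch below is only a check: it assumes composite Simpson quadrature on a truncated range 0 ≤ t ≤ 40, which is adequate when β is of order 1 because the integrand decays like e^(-βt). The function names and the truncation length are choices made for this illustration.

```python
import cmath
import math

def transform_numeric(alpha, beta, t_max=40.0, steps=40000):
    """Simpson estimate of (1/sqrt(2 pi)) * integral_0^t_max exp(-i alpha t) exp(-beta t) dt.

    `steps` must be even; `t_max` truncates the infinite range (assumes beta near 1).
    """
    h = t_max / steps

    def integrand(t):
        return cmath.exp(-(beta + 1j * alpha) * t)

    total = integrand(0.0) + integrand(t_max)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return total * h / 3.0 / math.sqrt(2.0 * math.pi)

def transform_closed(alpha, beta):
    """Closed form of Example 6.6."""
    return 1.0 / (math.sqrt(2.0 * math.pi) * (beta + 1j * alpha))

print(transform_numeric(2.0, 1.0))
print(transform_closed(2.0, 1.0))
```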
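Example 6.7. Suppose $f(x)$ is an even function. Equation (6.78) can be written

$$f(x) = \frac{1}{\pi} \int_0^{\infty} d\alpha \int_{-\infty}^{\infty} f(t) \cos \alpha t \cos \alpha x \, dt = \frac{2}{\pi} \int_0^{\infty} \cos \alpha x \, d\alpha \int_0^{\infty} f(t) \cos \alpha t \, dt \tag{6.82}$$

If we wish to solve the integral equation

$$e^{-\alpha} = \int_0^{\infty} \cos \alpha t \, f(t) \, dt$$

for $f(x)$, we apply (6.82) and obtain

$$f(x) = \frac{2}{\pi} \int_0^{\infty} e^{-\alpha} \cos \alpha x \, d\alpha = \frac{2}{\pi}\, \frac{1}{1 + x^2}$$

Notice that $f(x)$ is an even function and that this function has the necessary properties for the application of the Fourier integral formula.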


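Example 6.8. Let $p = \dfrac{d}{dt}$, and assume $Z(p)$ such that $Z(p) e^{i\omega t} = Z(i\omega) e^{i\omega t}$. A simple example is $Z(p) = a_0 p^n + a_1 p^{n-1} + \cdots + a_{n-1} p + a_n$, where $p^2 = \dfrac{d^2}{dt^2}$, etc. We consider the equation

$$Z(p)\, \theta_o(t) = \theta_i(t) \tag{6.83}$$

We assume that the input $\theta_i(t)$ is given, and we desire to find the output $\theta_o(t)$ such that (6.83) will hold. Let $H_i(\omega)$ and $H_o(\omega)$ be the Fourier transforms of $\theta_i(t)$ and $\theta_o(t)$, respectively. Then

$$H_i(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-i\omega t} \theta_i(t) \, dt \qquad H_o(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-i\omega t} \theta_o(t) \, dt \tag{6.84}$$

and

$$\theta_i(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{i\omega t} H_i(\omega) \, d\omega \qquad \theta_o(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{i\omega t} H_o(\omega) \, d\omega$$

Substituting into (6.83) yields

$$Z(p) \int_{-\infty}^{\infty} e^{i\omega t} H_o(\omega) \, d\omega = \int_{-\infty}^{\infty} e^{i\omega t} H_i(\omega) \, d\omega$$

Assuming that $Z(p)$ may be placed inside the integral yields

$$\int_{-\infty}^{\infty} Z(i\omega) e^{i\omega t} H_o(\omega) \, d\omega = \int_{-\infty}^{\infty} e^{i\omega t} H_i(\omega) \, d\omega$$

Since the two integrals are equal for the full range of $t$, it appears logical that

$$Z(i\omega) H_o(\omega) = H_i(\omega)$$

Hence

$$\theta_o(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{e^{i\omega t} H_i(\omega)}{Z(i\omega)} \, d\omega$$

$Z(p)$ is called the operational impedance, and its reciprocal, $Y(p)$, is called the transfer function. We now have

$$\theta_o(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{i\omega t} H_i(\omega) Y(i\omega) \, d\omega$$

Making use of (6.84) yields

$$\theta_o(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega t} Y(i\omega) \int_{-\infty}^{\infty} \theta_i(\tau) e^{-i\omega \tau} \, d\tau \, d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \theta_i(\tau) Y(i\omega) e^{i\omega(t - \tau)} \, d\tau \, d\omega$$

$$= \int_{-\infty}^{\infty} \theta_i(\tau)\, y(t - \tau) \, d\tau \tag{6.85}$$

where

$$y(t - \tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} Y(i\omega) e^{i\omega(t - \tau)} \, d\omega$$

We have assumed that the order of integration could be interchanged. Equation (6.85) can be written

$$\theta_o(t) = \int_{-\infty}^{\infty} \theta_i(t - \tau)\, y(\tau) \, d\tau \tag{6.86}$$

$y(\tau)$ is appropriately called the memory function because of the term $\theta_i(t - \tau)$. Let the reader compare (6.86) with the particular solution of a linear differential equation with constant coefficients given by (5.71).

No attempt at rigor has been made in obtaining the results of this example.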
Problems 


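1. Suppose that $f(x)$ of (6.74) is an even function. Show that

$$\tfrac{1}{2}[f(x + 0) + f(x - 0)] = \frac{2}{\pi} \int_0^{\infty} d\alpha \int_0^{\infty} f(t) \cos \alpha t \cos \alpha x \, dt \qquad x \ge 0$$

If $f(x)$ is odd, show that

$$\tfrac{1}{2}[f(x + 0) + f(x - 0)] = \frac{2}{\pi} \int_0^{\infty} d\alpha \int_0^{\infty} f(t) \sin \alpha t \sin \alpha x \, dt \qquad x \ge 0$$

2. Find the Fourier transform $g(\alpha)$ of the function $f(x)$ defined as follows: $f(x) = 0$ for $x < 0$, $f(x) = 1/a$ for $0 \le x \le a$, $f(x) = 0$ for $x > a$. Find $\lim_{a \to 0} g(\alpha)$.

3. Find the Fourier transform for $f(x)$ defined as follows: $f(x) = 0$ for $x < 0$, $f(x) = e^{-x}$ for $x \ge 0$. Show that

$$\int_0^{\infty} \frac{\cos \alpha x + \alpha \sin \alpha x}{1 + \alpha^2} \, d\alpha = \begin{cases} 0 & \text{for } x < 0 \\ \pi/2 & \text{for } x = 0 \\ \pi e^{-x} & \text{for } x > 0 \end{cases}$$

4. Consider $g(\alpha) = \int_{-\infty}^{\infty} e^{-i\alpha t} f'(t) \, dt$ such that $\lim_{t \to \pm\infty} f(t) = 0$. Show that

$$g(\alpha) = i\alpha \int_{-\infty}^{\infty} f(t)\, e^{-i\alpha t} \, dt$$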


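6.17. Nonlinear Differential Equations. We discuss the nonlinear differential equation

$$\frac{d^2 x}{dt^2} + x + \mu f\!\left(x, \frac{dx}{dt}\right) = 0 \qquad \mu \ll 1 \tag{6.87}$$

which can be written as the system

$$\frac{dx}{dt} = y \qquad \frac{dy}{dt} = -x - \mu f(x, y) \tag{6.88}$$

The solution of (6.87) for $\mu = 0$ is $x = A \cos (t + B)$, $y = \dfrac{dx}{dt} = -A \sin (t + B)$. The method of Kryloff-Bogoliuboff is to vary $A$ and $B$ in the hope that an approximate solution to (6.88) may be found. This is essentially the variation-of-parameter method discussed in Chap. 5.

We write

$$x = r \cos \theta \qquad y = -r \sin \theta$$

and obtain

$$\frac{dx}{dt} = \frac{dr}{dt} \cos \theta - r \sin \theta\, \frac{d\theta}{dt} \qquad \frac{dy}{dt} = -\frac{dr}{dt} \sin \theta - r \cos \theta\, \frac{d\theta}{dt}$$

Equations (6.88) can now be written

$$\frac{dr}{dt} \cos \theta - r \sin \theta\, \frac{d\theta}{dt} = -r \sin \theta$$

$$-\frac{dr}{dt} \sin \theta - r \cos \theta\, \frac{d\theta}{dt} = -r \cos \theta - \mu f(r \cos \theta, -r \sin \theta)$$

so that

$$\frac{dr}{dt} = \mu \sin \theta \, f(r \cos \theta, -r \sin \theta) \qquad -r \frac{d\theta}{dt} = -r - \mu \cos \theta \, f(r \cos \theta, -r \sin \theta) \qquad \mu \ll 1 \tag{6.89}$$

For the range of values $(r, \theta)$ such that

$$\mu f(r \cos \theta, -r \sin \theta) \ll 1$$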




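one has to a first approximation

$$\frac{dr}{dt} = 0 \qquad \frac{d\theta}{dt} = 1 \tag{6.90}$$

Integration yields $\theta = \theta_0 + t$. We replace $\theta$ of (6.89) by $\theta_0 + t$ so that

$$\frac{dr}{dt} = \mu \sin (\theta_0 + t) \, f(r \cos (\theta_0 + t), -r \sin (\theta_0 + t)) \tag{6.91}$$

Since $\mu \ll 1$, the value of $r$ will not change appreciably over the interval $(t, t + 2\pi)$. Integrating (6.91) on this basis [$r$ is assumed constant on the right-hand side of (6.91)] yields

$$r(t + 2\pi) - r(t) = \mu \int_t^{t + 2\pi} \sin (\theta_0 + t) \, f(r \cos (\theta_0 + t), -r \sin (\theta_0 + t)) \, dt$$

$$= \mu \int_0^{2\pi} \sin \psi \, f(r \cos \psi, -r \sin \psi) \, d\psi = \mu F(r) \tag{6.92}$$

where

$$F(r) = \int_0^{2\pi} \sin \psi \, f(r \cos \psi, -r \sin \psi) \, d\psi \tag{6.93}$$

The periodicity of $\sin \psi \, f(r \cos \psi, -r \sin \psi)$ introduced the new limits of integration. Thus

$$\frac{r(t + 2\pi) - r(t)}{2\pi} = \frac{\mu}{2\pi} F(r)$$

Now $\dfrac{r(t + 2\pi) - r(t)}{2\pi}$ represents the average change of $r(t)$ for $t$ ranging from $t$ to $t + 2\pi$. Since $\dfrac{dr}{dt} \ll 1$, the average change of $r(t)$ can be replaced by $\dfrac{dr}{dt}$ itself, at least to a first approximation. This yields

$$\frac{dr}{dt} = \frac{\mu}{2\pi} F(r) \tag{6.94}$$

The integration of (6.94) yields the first approximation for $r(t)$. The same method applied to the second equation of (6.89) yields

$$\frac{d\theta}{dt} = 1 + \frac{\mu}{2\pi r} \int_0^{2\pi} \cos \psi \, f(r \cos \psi, -r \sin \psi) \, d\psi = 1 + \frac{\mu}{2\pi r} G(r)$$

and

$$\theta = \theta_0 + \left[ 1 + \frac{\mu}{2\pi r} G(r) \right] t \tag{6.95}$$

with

$$G(r) = \int_0^{2\pi} \cos \psi \, f(r \cos \psi, -r \sin \psi) \, d\psi \tag{6.96}$$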




Before applying Fourier-series methods to obtain improved approxima- 
tions, we discuss an example. 


Example 6.9. Van der Pol's equation is

$$\frac{d^2 x}{dt^2} + x - \mu (1 - x^2) \frac{dx}{dt} = 0 \qquad \mu \ll 1$$

Here $f(x, y) = -(1 - x^2) y$, and

$$f(r \cos \psi, -r \sin \psi) = (1 - r^2 \cos^2 \psi)\, r \sin \psi$$

so that

$$F(r) = r \int_0^{2\pi} \sin^2 \psi \, (1 - r^2 \cos^2 \psi) \, d\psi = \pi r \left( 1 - \frac{r^2}{4} \right)$$

Equation (6.94) becomes

$$\frac{dr}{dt} = \frac{\mu r}{2} \left( 1 - \frac{r^2}{4} \right)$$

whose solution is

$$r(t) = \frac{r_0 e^{\mu t/2}}{\sqrt{1 + \tfrac{1}{4} r_0^2 (e^{\mu t} - 1)}} \tag{6.97}$$

At $t = 0$, $r(0) = r_0$. For $r_0 \ne 0$ we have

$$\lim_{t \to \infty} r(t) = 2$$

Since $r^2 = x^2 + y^2$, we note that to a first approximation the motion in the phase plane $(x, y = dx/dt)$ resembles a spiral motion into the limit cycle circle $x^2 + y^2 = r^2 = 4$. From (6.95) we obtain $G(r) = 0$, so that $\theta = \theta_0 + t$, which yields nothing new.
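The first approximation (6.97) can be compared with a direct numerical integration of the system (6.88) for Van der Pol's equation. The sketch below is an illustration only: it assumes μ = 0.05, the starting point r_0 = 0.5 with θ_0 = 0, and a classical fourth-order Runge-Kutta step, none of which come from the text. The quantity sqrt(x² + y²) should stay within a distance of order μ of r(t) and approach 2.

```python
import math

MU = 0.05  # small parameter; an arbitrary choice for this sketch

def rhs(x, y):
    """System (6.88) with f(x, y) = -(1 - x^2) y."""
    return y, -x + MU * (1.0 - x * x) * y

def rk4_step(x, y, h):
    """One classical Runge-Kutta step of size h."""
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def averaged_radius(t, r0):
    """First approximation (6.97)."""
    return r0 * math.exp(MU * t / 2.0) / math.sqrt(
        1.0 + 0.25 * r0 * r0 * (math.exp(MU * t) - 1.0))

def simulated_radius(r0, t_end, h=0.02):
    """sqrt(x^2 + y^2) at time t_end, starting from x = r0, y = 0."""
    x, y = r0, 0.0
    for _ in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, h)
    return math.hypot(x, y)

for t_end in (40.0, 80.0, 200.0):
    print(t_end, simulated_radius(0.5, t_end), averaged_radius(t_end, 0.5))
```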


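To obtain an improved approximation to (6.87) we proceed as follows: Let

$$x(t) = r(t) \cos \theta(t) + \mu a(r) + \mu \sum_{n=2}^{\infty} \alpha_n(r) \cos n\theta(t) + \mu \sum_{n=2}^{\infty} \beta_n(r) \sin n\theta(t) \tag{6.98}$$

and let us attempt to make (6.98) a solution of (6.87). $r(t)$ and $\theta(t)$ will be taken as the first approximation given by the solution of (6.94) along with (6.95). We shall attempt to find $a(r)$, $\alpha_n(r)$, $\beta_n(r)$, $n = 2, 3, 4, \ldots$. Terms involving $\mu^2$ will be ignored. From (6.98) we obtain

$$\frac{dx}{dt} = \frac{dr}{dt} \cos \theta - r \sin \theta\, \frac{d\theta}{dt} - \mu \sum_{n=2}^{\infty} n \alpha_n \sin n\theta\, \frac{d\theta}{dt} + \mu \sum_{n=2}^{\infty} n \beta_n \cos n\theta\, \frac{d\theta}{dt} \tag{6.99}$$

The term $\mu \dfrac{da}{dt} = \mu \dfrac{da}{dr} \dfrac{dr}{dt} = \mu^2 \dfrac{F(r)}{2\pi} \dfrac{da}{dr}$ is of the order of $\mu^2$ and so is neglected. The same reasoning applies to the terms involving

$$\mu \frac{d\alpha_n}{dt} = \mu \frac{d\alpha_n}{dr} \frac{dr}{dt} = \frac{\mu^2 F(r)}{2\pi} \frac{d\alpha_n}{dr} \qquad \mu \frac{d\beta_n}{dt} = \mu \frac{d\beta_n}{dr} \frac{dr}{dt} = \frac{\mu^2 F(r)}{2\pi} \frac{d\beta_n}{dr}$$

Differentiating again yields

$$\frac{d^2 x}{dt^2} = -2 \frac{dr}{dt} \frac{d\theta}{dt} \sin \theta - r \cos \theta \left( \frac{d\theta}{dt} \right)^2 - \mu \sum_{n=2}^{\infty} n^2 \alpha_n \cos n\theta \left( \frac{d\theta}{dt} \right)^2 - \mu \sum_{n=2}^{\infty} n^2 \beta_n \sin n\theta \left( \frac{d\theta}{dt} \right)^2 \tag{6.100}$$

Terms involving $\dfrac{d^2 r}{dt^2}$, $\dfrac{d^2 \theta}{dt^2}$ have been omitted since

$$\frac{d^2 r}{dt^2} = \frac{\mu}{2\pi} F'(r) \frac{dr}{dt} = \frac{\mu^2}{4\pi^2} F'(r) F(r)$$

$$\frac{d^2 \theta}{dt^2} = \frac{\mu^2}{4\pi^2}\, \frac{r G'(r) - G(r)}{r^2}\, F(r) \qquad r \ne 0$$

Moreover

$$\left( \frac{d\theta}{dt} \right)^2 = 1 + \frac{\mu}{\pi r} G(r) \qquad \frac{dr}{dt} \frac{d\theta}{dt} = \frac{\mu}{2\pi} F(r)$$

when we neglect terms of the order of $\mu^2$. Substituting $x(t)$ of (6.98) and $\dfrac{d^2 x}{dt^2}$ of (6.100) into (6.87) yields

$$\mu a(r) - \frac{\mu}{\pi} F(r) \sin \theta - \frac{\mu}{\pi} G(r) \cos \theta - \mu \sum_{n=2}^{\infty} (n^2 - 1) \alpha_n \cos n\theta - \mu \sum_{n=2}^{\infty} (n^2 - 1) \beta_n \sin n\theta = -\mu f(r \cos \theta, -r \sin \theta) \tag{6.101}$$

In the term $\mu f(x, \dot{x})$ we have replaced $x$ by $r \cos \theta$, $\dot{x}$ by $-r \sin \theta$, again neglecting terms of the order of $\mu^2$.

Since the left-hand side of (6.101) is a Fourier series, we expand the right-hand side of (6.101) in a Fourier series,

$$f(r \cos \theta, -r \sin \theta) = a_0(r) + \sum_{n=1}^{\infty} a_n(r) \cos n\theta + \sum_{n=1}^{\infty} b_n(r) \sin n\theta$$

with

$$a_0(r) = \frac{1}{2\pi} \int_0^{2\pi} f(r \cos \theta, -r \sin \theta) \, d\theta$$

$$a_n(r) = \frac{1}{\pi} \int_0^{2\pi} \cos n\theta \, f(r \cos \theta, -r \sin \theta) \, d\theta \qquad b_n(r) = \frac{1}{\pi} \int_0^{2\pi} \sin n\theta \, f(r \cos \theta, -r \sin \theta) \, d\theta \qquad n = 1, 2, 3, \ldots$$

Equating coefficients of $\cos n\theta$ and $\sin n\theta$ in (6.101) yields

$$F(r) = \int_0^{2\pi} \sin \theta \, f(r \cos \theta, -r \sin \theta) \, d\theta \qquad G(r) = \int_0^{2\pi} \cos \theta \, f(r \cos \theta, -r \sin \theta) \, d\theta$$

$$a(r) = -\frac{1}{2\pi} \int_0^{2\pi} f(r \cos \theta, -r \sin \theta) \, d\theta \tag{6.102}$$

$$\alpha_n(r) = \frac{1}{\pi (n^2 - 1)} \int_0^{2\pi} \cos n\theta \, f(r \cos \theta, -r \sin \theta) \, d\theta \qquad \beta_n(r) = \frac{1}{\pi (n^2 - 1)} \int_0^{2\pi} \sin n\theta \, f(r \cos \theta, -r \sin \theta) \, d\theta \qquad n = 2, 3, \ldots$$

The values of $F(r)$, $G(r)$ given by (6.102) agree with their previous definitions. It can be shown that $x(t)$ given by (6.98) satisfies (6.87) with accuracy of order $\mu^2$ for $0 \le t < \infty$.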


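Example 6.10. In Example 6.9, $f(x, \dot{x}) = (x^2 - 1)\dot{x}$, so that

$$f(r \cos \theta, -r \sin \theta) = (1 - r^2 \cos^2 \theta)\, r \sin \theta = \left( r - \frac{r^3}{4} \right) \sin \theta - \frac{r^3}{4} \sin 3\theta$$

Applying (6.102) yields

$$a(r) = 0$$
$$\alpha_n(r) = 0 \qquad n = 2, 3, 4, \ldots$$
$$\beta_3(r) = -\frac{r^3}{32}$$
$$\beta_n(r) = 0 \qquad n = 2, 4, 5, \ldots$$

Thus the improved first approximation is

$$x(t) = r(t) \cos (\theta_0 + t) - \mu\, \frac{r^3}{32} \sin 3(\theta_0 + t)$$

with $r(t)$ given by (6.97).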
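The coefficients of this example can be recovered numerically from the integrals in (6.102). The sketch below is a check only, with hypothetical function names: it evaluates b_n(r) = (1/π)∫ sin nθ f(r cos θ, -r sin θ) dθ over 0 ≤ θ ≤ 2π by the rectangle rule on a uniform grid, which is exact up to rounding for trigonometric polynomials of low degree, and then forms β_n = b_n/(n² - 1).

```python
import math

def f_polar(r, theta):
    """f(r cos(theta), -r sin(theta)) for Van der Pol: (1 - r^2 cos^2(theta)) r sin(theta)."""
    return (1.0 - (r * math.cos(theta)) ** 2) * r * math.sin(theta)

def sine_coefficient(r, n, samples=2000):
    """b_n(r) = (1/pi) * integral over [0, 2 pi] of sin(n theta) f, by the rectangle rule."""
    h = 2.0 * math.pi / samples
    total = sum(math.sin(n * k * h) * f_polar(r, k * h) for k in range(samples))
    return total * h / math.pi

def beta(r, n):
    """beta_n(r) = b_n(r) / (n^2 - 1) for n >= 2, as in (6.102)."""
    return sine_coefficient(r, n) / (n * n - 1)

r = 1.5
print(sine_coefficient(r, 1), r - r ** 3 / 4)   # coefficient of sin(theta)
print(beta(r, 3), -r ** 3 / 32)                 # the only nonzero beta_n
print(beta(r, 2), beta(r, 4))                   # both vanish up to rounding
```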


Problems 


1. The differential equation of a simple pendulum is $\ddot{\theta} + (g/l) \sin \theta = 0$. Take $\sin \theta = \theta - \theta^3/6$, and show that the period of oscillation depends on the square of the amplitude.

2. Solve x +z + uwt|z| = 0, » <1, for the improved first approximation. 

8. Solve 4 + 27 + u(sign z)z = 0, uw < 1, for the improved first approximation. 




4. Solve $\ddot{x} + x + \mu(-\alpha + \beta \dot{x}^2)\dot{x} = 0$, $\mu \ll 1$, for the improved first approximation. Show that a limit cycle occurs and that the radius of this limit cycle is $r_0 = \sqrt{4\alpha/3\beta}$.
5. Consider the differential equation 





‘zc +ét+ ux? =0 p< (6.103) 
written as the system 
dz _ 
ai” 4 
oy = 2 (6.104) 
dz : 
oo ee 


such that, at ¢ = tb, x = xo, ¥ = Yo, 2 = z. The solution of x 4-2 = O18 


r(t) = A+ Becos (t + C) 


l 
y= a = —-Bsn(t+C), z= 7 = —Beos(€+C) This’ suggests the trans- 
formation 


x=w—)7 cos 6 
y =7 sin 6 
zZ r cos @ 


I 


in an attempt to find an approximate solution of (6104) Let the reader show that to 
a first approximation 


@6=t+ 
OP oii 
dg OF 
dw ( ee 5) 
aU AM 9 
: = : | a‘ =. ) 2 9 9 
and w(t) = + = va eens with (ro + zo)? =7- , a reo ye + 22 
2 r 4 or 
REFERENCES

Carslaw, H. S.: "Introduction to the Theory of Fourier's Series and Integrals," Dover Publications, Inc., New York, 1930.

Churchill, R. V.: "Fourier Series and Boundary Value Problems," McGraw-Hill Book Company, Inc., New York, 1941.

Courant, R., and D. Hilbert: "Methoden der mathematischen Physik," Springer-Verlag OHG, Berlin, 1931.

Franklin, P.: "Fourier Methods," McGraw-Hill Book Company, Inc., New York, 1949.

Minorsky, N.: "Introduction to Nonlinear Mechanics," J. W. Edwards, Publisher, Inc., Ann Arbor, Mich., 1947.

Sneddon, I. N.: "Fourier Transforms," McGraw-Hill Book Company, Inc., New York, 1951.

Stoker, J. J.: "Nonlinear Vibrations," Interscience Publishers, Inc., New York, 1950.

Szegö, G.: "Orthogonal Polynomials," American Mathematical Society Colloquium Publications, vol. 23, 1939.


CHAPTER 7 


THE STIELTJES INTEGRAL, LAPLACE TRANSFORM, 
AND CALCULUS OF VARIATIONS 


7.1. Functions of Bounded Variation. In an attempt to define arc length of a curve one is led to consider functions of bounded variation. Let us consider a simple curve, $\Gamma$, given parametrically by $x = f(t)$, $y = g(t)$, $\alpha \le t \le \beta$. Two distinct values of $t$ are assumed to yield two distinct points on $\Gamma$, so that as $t$ varies from $\alpha$ to $\beta$, the point $P$ with coordinates $x = f(t)$, $y = g(t)$ moves continuously from one end of the curve to the other without retracing its motion. We now subdivide the interval $(\alpha, \beta)$ into

$$\alpha = t_0 < t_1 < t_2 < \cdots < t_{k-1} < t_k < \cdots < t_{n-1} < t_n = \beta$$

For $t = t_k$ we have the point $P_k$ with coordinates $x_k = f(t_k)$, $y_k = g(t_k)$, $P_k$ on $\Gamma$, $k = 0, 1, 2, \ldots, n$. The straight-line distance from $P_{k-1}$ to $P_k$ is given by

$$\{[f(t_k) - f(t_{k-1})]^2 + [g(t_k) - g(t_{k-1})]^2\}^{1/2}$$

The sum total of these straight-line arcs is

$$S_n = \sum_{k=1}^{n} \{[f(t_k) - f(t_{k-1})]^2 + [g(t_k) - g(t_{k-1})]^2\}^{1/2} \tag{7.1}$$

Now if for all manner of subdivisions of $\alpha \le t \le \beta$ a constant $A$ exists such that

$$\sum_{k=1}^{n} |f(t_k) - f(t_{k-1})| < A \qquad \sum_{k=1}^{n} |g(t_k) - g(t_{k-1})| < A \tag{7.2}$$

then $S_n < 2A$, since each term of (7.1) is at most $|f(t_k) - f(t_{k-1})| + |g(t_k) - g(t_{k-1})|$. Since the set of numbers $\{S_n\}$ is bounded, of necessity, a least upper bound (supremum), $L$, will exist for this set. We define $L$ as the length of arc of $\Gamma$. For a discussion of the supremum see Chap. 10. Conversely, one easily shows that if the $\{S_n\}$ are bounded for all subdivisions then a constant $A$ exists satisfying (7.2) for all subdivisions. The inequality of (7.2) leads to the following definition:

Let $f(x)$ be defined on the bounded interval $a \le x \le b$. If a constant $A$ exists such that for all possible finite subdivisions of $(a, b)$ into

$$x_0 = a < x_1 < x_2 < \cdots < x_{n-1} < x_n = b$$

we have

$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| < A \tag{7.3}$$

we say that $f(x)$ is of bounded variation on $a \le x \le b$.

A bounded monotonic nondecreasing or nonincreasing function is always of bounded variation. If $f(x)$ is a monotonic nondecreasing function, then

$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| = \sum_{k=1}^{n} [f(x_k) - f(x_{k-1})] = f(b) - f(a)$$

Thus (7.3) is satisfied for $A > f(b) - f(a)$. Another example of a function of bounded variation is the following: Assume $f(x)$ has a continuous derivative for $a \le x \le b$. Then

$$|f(x_k) - f(x_{k-1})| = |(x_k - x_{k-1}) f'(\xi_k)| < M (x_k - x_{k-1})$$

since $|f'(x)| < M$ for $a \le x \le b$. Thus

$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| < M \sum_{k=1}^{n} |x_k - x_{k-1}| = M(b - a)$$

which proves our statement.

Continuity of $f(x)$ is not sufficient to guarantee that $f(x)$ is of bounded variation. For example, consider $f(x) = x \sin (1/x)$, $x \ne 0$, $f(0) = 0$, $0 \le x \le 2/\pi$. Since

$$\lim_{x \to 0} f(x) = \lim_{x \to 0} x \sin \frac{1}{x} = 0 = f(0)$$

$f(x)$ is continuous at $x = 0$. Moreover $f(x)$ is easily seen to be continuous for $x \ne 0$. Let us subdivide $(0, 2/\pi)$ into

$$\left( 0, \frac{2}{(2n + 1)\pi}, \frac{2}{(2n - 1)\pi}, \ldots, \frac{2}{3\pi}, \frac{2}{\pi} \right)$$

Then

$$\sum_{k} |f(x_k) - f(x_{k-1})| > \frac{2}{\pi} \left( 1 + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{2n + 1} \right)$$

Since $\sum_{k=0}^{\infty} \dfrac{1}{2k + 1}$ diverges, no constant $A$ exists satisfying (7.3) for all modes of subdivision.
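The growth of these sums is easy to observe numerically. The following sketch is an illustration with a hypothetical function name `variation_sum`: it evaluates the sum of |f(x_k) - f(x_{k-1})| for f(x) = x sin(1/x) on the subdivision 0 < 2/((2n+1)π) < ... < 2/(3π) < 2/π and compares it with the lower bound (2/π)(1 + 1/3 + ... + 1/(2n+1)).

```python
import math

def f(x):
    """f(x) = x sin(1/x) for x != 0, f(0) = 0."""
    return x * math.sin(1.0 / x) if x != 0.0 else 0.0

def variation_sum(n):
    """Sum of |f(x_k) - f(x_{k-1})| over 0 < 2/((2n+1)pi) < ... < 2/(3 pi) < 2/pi."""
    points = [0.0] + [2.0 / ((2 * j + 1) * math.pi) for j in range(n, -1, -1)]
    return sum(abs(f(b) - f(a)) for a, b in zip(points, points[1:]))

def lower_bound(n):
    """(2/pi) * (1 + 1/3 + ... + 1/(2n+1))."""
    return (2.0 / math.pi) * sum(1.0 / (2 * k + 1) for k in range(n + 1))

for n in (10, 100, 1000, 10000):
    # Both columns grow without bound, roughly like log n.
    print(n, variation_sum(n), lower_bound(n))
```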

A fundamental result concerning functions of bounded variation is the following theorem:

A necessary and sufficient condition that $f(x)$ be of bounded variation on $a \le x \le b$ is that $f(x)$ be written as the difference of two positive monotonic nondecreasing functions. That the condition is sufficient follows almost immediately from previous considerations concerning monotonic nondecreasing functions. Now let us assume that $f(x)$ is of bounded variation on $(a, b)$. Let $x$ be any number on the interval $(a, b)$. Let us subdivide $(a, x)$ into

$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = x$$

Then

$$S_n = \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| < A \tag{7.4}$$

Some of the terms of $S_n$ are such that $f(x_k) \ge f(x_{k-1})$, whereas other terms are such that $f(x_k) < f(x_{k-1})$. We write

$$S_n = P_n + N_n$$

where $P_n$ is the sum of terms of $S_n$ for which $f(x_k) \ge f(x_{k-1})$ and $N_n$ is the sum of terms of $S_n$ for which $f(x_k) < f(x_{k-1})$. One easily shows that $P_n - N_n = f(x) - f(a)$ so that

$$S_n = f(x) - f(a) + 2N_n \qquad S_n = -f(x) + f(a) + 2P_n \tag{7.5}$$

Since $S_n < A$ for all methods of subdividing the interval $(a, x)$, we know from real-variable theory (see Chap. 10) that a least upper bound exists for the set $\{S_n\}$. We call this least upper bound, $V(a, x)$, the total variation of $f(x)$ on the interval $(a, x)$. The suprema (least upper bounds) of $\{N_n\}$ and $\{P_n\}$ are written as $N(a, x)$, $P(a, x)$, respectively. From (7.5) we have

$$V(a, x) = f(x) - f(a) + 2N(a, x) \qquad V(a, x) = -f(x) + f(a) + 2P(a, x) \tag{7.6}$$

so that

$$f(x) = f(a) + P(a, x) - N(a, x) \tag{7.7}$$

From the very definitions of $P(a, x)$, $N(a, x)$ we note that they are monotonic nondecreasing functions of $x$ [see (7.6)]. $f(a) + P(a, x)$ is monotonic nondecreasing, which proves the theorem.




Problem 1. Let $f(x) = \sin x$ for $0 \le x \le 2\pi$. Write $f(x)$ as the difference of two monotonic nondecreasing functions of $x$ for this interval.

Problem 2. If $f(x)$ is of bounded variation and continuous for $a \le x \le b$, show that $P(a, x)$ and $N(a, x)$ are continuous for $a \le x \le b$.


7.2. The Stieltjes Integral. An important generalization of the Riemann integral is the Stieltjes integral. The Stieltjes integral is defined as follows: Let $f(x)$ and $g(x)$ be real-valued functions of the real variable $x$ for $a \le x \le b$. Subdivide the interval $(a, b)$ into

$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b$$

and let $\delta$ be the largest of the numbers $x_k - x_{k-1}$, $k = 1, 2, \ldots, n$. Now form the sum

$$S_n = \sum_{k=1}^{n} f(\xi_k)[g(x_k) - g(x_{k-1})] \tag{7.8}$$

where $\xi_k$ is any number such that $x_{k-1} \le \xi_k \le x_k$. If $\lim_{n \to \infty} S_n$ exists independent of the choice of the $\xi_k$ and the method of subdivision, provided $\delta \to 0$, we call this limit the Stieltjes integral of $f(x)$ relative to $g(x)$ on $(a, b)$, written

$$\int_a^b f(x) \, dg(x) \tag{7.9}$$

In the special case $g(x) = x$, (7.9) reduces to the Riemann integral.
If f(z) is bounded and if g(x) is a bounded monotonic nondecreasing 
function of z, then S, of (7.8) satisfies the following inequality, 


n 


mig(b) — g(a)} S$ ) milg(a) — g(er)] 
k=] 


17. 


n 


Sn =) Milg(es) — g(te+)] S Mlg(b) — g(a)] 


k=1 


IIA 


where m, is the infemum (greatest lower bound) of f(x) for 4%) S x < 
tr-1, M;, is the supremum (least upper bound) of f(x) for v1 S x S x, 


m and M are the infemum and supremum of f(x), respectively, for 
nr 


asxxxsb. The supremum of the sums > milg(a.) — g(az-1)] can be 
k=1 
called the lower Darboux-Stieltjes integral, L, and the infemum of the 


sums Mi[g(x.) — g(ae-1)] can be called the upper Darboux-Stieltjes 
k=1 


THE STIELTJES INTEGRAL 281 


integral, U. If these two integrals are equal, we say that the Riemann- 
Stieltjes integral exists and write 


Lav = [ ” #(x) dg(zx) 


This definition is easily seen to be equivalent to the one given above.

If f(x) is continuous in (a, b), it is a simple matter to prove that L = U.
Now if g(x) is a function of bounded variation, it can be written as the
difference of two monotonic nondecreasing functions. It follows immediately
that (7.9) exists if f(x) is continuous and g(x) is of bounded
variation.


Example 7.1. Let f(x) be continuous for $0 \le x \le 1$, and let g(x) = 0 for $0 \le x < \tfrac{1}{2}$,
g(x) = 1 for $\tfrac{1}{2} \le x \le 1$. For any subdivision not containing $x = \tfrac{1}{2}$ we have dg(x) = 0.
The subdivision covering $x = \tfrac{1}{2}$ yields dg(x) = 1. Thus $S_n = f(\xi)$, where $\xi$ is any
number near $x = \tfrac{1}{2}$. Since $\lim_{n \to \infty} f(\xi) = f(\tfrac{1}{2})$, we have

$$\int_0^1 f(x)\, dg(x) = f(\tfrac{1}{2})$$
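
The sum (7.8) is easy to evaluate directly. The sketch below is only an illustration, assuming a uniform subdivision and a fixed rule for choosing $\xi_k$ inside each subinterval; the names `stieltjes_sum`, `tag`, and `step` are hypothetical. It reproduces Example 7.1 and the Riemann special case g(x) = x.

```python
import math

def stieltjes_sum(f, g, a, b, n, tag=0.5):
    """S_n of (7.8) on a uniform subdivision of (a, b) into n parts.

    tag in [0, 1] selects xi_k = x_{k-1} + tag * (x_k - x_{k-1}).
    """
    total = 0.0
    for k in range(1, n + 1):
        x0 = a + (b - a) * (k - 1) / n
        x1 = a + (b - a) * k / n
        xi = x0 + tag * (x1 - x0)
        total += f(xi) * (g(x1) - g(x0))
    return total

def step(x):
    """The integrator of Example 7.1: a unit jump at x = 1/2."""
    return 0.0 if x < 0.5 else 1.0

# Example 7.1 with f = cos: the sum picks out a value of f near x = 1/2.
print(stieltjes_sum(math.cos, step, 0.0, 1.0, 1001), math.cos(0.5))
# g(x) = x gives an ordinary Riemann sum for the integral of cos on (0, 1).
print(stieltjes_sum(math.cos, lambda x: x, 0.0, 1.0, 1000), math.sin(1.0))
```

Only the subinterval containing the jump contributes in the first call, which is the content of Example 7.1.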


Problem 8. Let f(z) = 7 +34, g(z) = z?forO0 Sz 1 Show that 
1 1 
i: f(x) dg(r) = %& 
Problem 4. If f(x) and g'(x) are continuous for $a \le x \le b$, show that

$$\int_a^b f(x)\, dg(x) = \int_a^b f(x) g'(x)\, dx$$

The last integral is a Riemann integral.

Problem 5. If f(x) and g(x) have a common point of discontinuity, show that the
Stieltjes integral of f(x) relative to g(x) does not exist provided the range of integration
covers the point of discontinuity.

Problem 6. If h(x) is nondecreasing, f(x) and g(x) continuous with $f(x) \ge g(x)$,
show that $\int_a^b f(x)\, dh(x) \ge \int_a^b g(x)\, dh(x)$.


Problem 7. Let $S_n = \sum_{k=1}^{n} g(\xi_k)[f(x_k) - f(x_{k-1})]$, $x_{k-1} \le \xi_k \le x_k$. Show that

$$S_n = g(\xi_n) f(b) - g(\xi_1) f(a) - \sum_{k=1}^{n-1} f(x_k)[g(\xi_{k+1}) - g(\xi_k)]$$

with $x_0 = a$, $x_n = b$. Assume g(x) of bounded variation, f(x) continuous, and g(x)
continuous at x = a and x = b. Show that

$$\int_a^b g(x)\, df(x) = g(b) f(b) - g(a) f(a) - \int_a^b f(x)\, dg(x)$$


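Problem 8. Let f(x) = g(x) + ih(x) be continuous for $a \le x \le b$, and assume
$\alpha(x) = \beta(x) + i\gamma(x)$ to be of bounded variation. Show that

$$\int_a^b f(x)\, d\alpha(x) = \int_a^b g(x)\, d\beta(x) - \int_a^b h(x)\, d\gamma(x) + i \int_a^b g(x)\, d\gamma(x) + i \int_a^b h(x)\, d\beta(x)$$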




Problem 9. Consider the sequence of continuous functions, $f_n(x)$, n = 1, 2, ....
Assume $\sum_{n=1}^{\infty} f_n(x)$ converges uniformly for $a \le x \le b$, and let g(x) be of bounded
variation for this interval. Show that

$$\int_a^b \sum_{n=1}^{\infty} f_n(x)\, dg(x) = \sum_{n=1}^{\infty} \int_a^b f_n(x)\, dg(x)$$

Problem 10. How would you define $\int_a^{\infty} f(x)\, dg(x)$, and under what conditions
would this integral exist?


7.3. The Laplace Transform. Let g(t) be a complex function of the
real variable t defined for $t \ge 0$. Let g(t) be of bounded variation on the
finite interval $0 \le t \le R$, R arbitrary. The function $e^{-zt}$ with z = x + iy
is continuous for all t, so that the integral

$$\int_0^R e^{-zt}\, dg(t) \tag{7.10}$$

exists for all complex z. It may be that, for a given value of z,

$$\lim_{R \to \infty} \int_0^R e^{-zt}\, dg(t) \tag{7.11}$$

exists. If this is the case, we write

$$\int_0^{\infty} e^{-zt}\, dg(t) = \lim_{R \to \infty} \int_0^R e^{-zt}\, dg(t) \tag{7.12}$$

Equation (7.12) is called an improper integral, and the right-hand side of
(7.12) is called the Cauchy value of the improper integral.
Those values of z for which (7.12) exists define a function of z, written

$$f(z) = \int_0^{\infty} e^{-zt}\, dg(t) \tag{7.13}$$

f(z) is called the Laplace-Stieltjes transform of g(t).
We consider now the region of z for which f(z) of (7.13) exists. First
we investigate three special cases. Let $g(t) = \int_0^t e^{e^u}\, du$, so that g(t) is
monotonic increasing, and hence is of bounded variation for $0 \le t \le R$,
R arbitrary. Since $dg(t) = e^{e^t}\, dt$, we have

$$\int_0^R e^{-(x+iy)t} e^{e^t}\, dt = \int_0^R e^{e^t - xt} \cos yt\, dt - i \int_0^R e^{e^t - xt} \sin yt\, dt$$

Since $e^t - xt$ increases beyond bound for any fixed x, we leave it to the
reader to show that $\lim_{R \to \infty} \int_0^R e^{-zt} e^{e^t}\, dt$ fails to exist for all z. As a second
example, let $g(t) = \int_0^t e^{-e^u}\, du$. Let the reader show that $\lim_{R \to \infty} \int_0^R e^{-zt}\, dg(t)$
exists for all z. Finally, let g(t) = t, so that

$$\lim_{R \to \infty} \int_0^R e^{-zt}\, dg(t) = \lim_{R \to \infty} \int_0^R e^{-zt}\, dt = \lim_{R \to \infty} \frac{1 - e^{-zR}}{z} = \frac{1}{z}$$

provided x = Rl z > 0. Hence $\int_0^{\infty} e^{-zt}\, dt = f(z)$ exists for Rl z > 0.
Let us consider now a general case. We assume that $\int_0^{\infty} e^{-z_0 t}\, dg(t)$
exists, with $z_0 = x_0 + iy_0$. Further, let us assume that a constant A
exists such that

$$\left| \int_0^u e^{-z_0 t}\, dg(t) \right| < A \tag{7.14}$$

for $u \ge 0$. Define h(u) by the equation

$$h(u) = \int_0^u e^{-z_0 t}\, dg(t) \tag{7.15}$$

so that $dh(u) = e^{-z_0 u}\, dg(u)$ and $dg(t) = e^{z_0 t}\, dh(t)$. Then

$$\int_0^R e^{-zt}\, dg(t) = \int_0^R e^{-(z - z_0)t}\, dh(t) \tag{7.16}$$

Integration by parts (see Prob. 7, Sec. 7.2) yields

$$\int_0^R e^{-zt}\, dg(t) = e^{-(z - z_0)R} \int_0^R e^{-z_0 t}\, dg(t) + (z - z_0) \int_0^R h(t) e^{-(z - z_0)t}\, dt$$

If Rl z > Rl $z_0$, then $\lim_{R \to \infty} e^{-(z - z_0)R} \int_0^R e^{-z_0 t}\, dg(t) = 0$, since
$\lim_{R \to \infty} e^{-(z - z_0)R} = 0$ and $\left| \int_0^R e^{-z_0 t}\, dg(t) \right| < A$ for all R. Moreover

$$\left| \int_0^R h(t) e^{-(z - z_0)t}\, dt \right| \le \int_0^R |h(t)| e^{-(x - x_0)t}\, dt < \frac{A}{x - x_0}$$

for $x > x_0$. Allowing R to become infinite in (7.16) yields

$$\int_0^{\infty} e^{-zt}\, dg(t) = (z - z_0) \int_0^{\infty} h(t) e^{-(z - z_0)t}\, dt$$

for Rl z > Rl $z_0$. Thus

$$f(z) = \int_0^{\infty} e^{-zt}\, dg(t) \tag{7.17}$$

exists for Rl z > Rl $z_0$. We have shown that, if (7.17) converges for
$z_0 = x_0 + iy_0$, then f(z) of (7.17) is well defined for z = x + iy provided
$x > x_0$. There may be singularities of f(z) on the line $z = x_0 + iy$,
$-\infty < y < \infty$, and in the half plane Rl z < $x_0$.




The ordinary Laplace transform of f(t) is defined as

$$F(z) = L[f(t)] = \int_0^{\infty} e^{-zt} f(t)\, dt \tag{7.18}$$

provided the improper integral exists. If the Laplace transform of f'(t)
also exists, one can integrate by parts to obtain

$$L[f'(t)] = \int_0^{\infty} e^{-zt} f'(t)\, dt = e^{-zt} f(t) \Big|_0^{\infty} + z \int_0^{\infty} e^{-zt} f(t)\, dt$$

$$L[f'(t)] = zL[f(t)] - f(0) \tag{7.19}$$

provided

$$\lim_{t \to \infty} e^{-zt} f(t) = 0 \tag{7.20}$$

Equation (7.20) will certainly hold if |f(t)| is bounded for $t \ge 0$ and if
Rl z > 0. Further application of (7.19) yields

$$L[f''(t)] = zL[f'(t)] - f'(0) = z^2 L[f(t)] - zf(0) - f'(0) \tag{7.21}$$

If F(z) is the Laplace transform of f(t), we say that f(t) is the inverse
Laplace transform of F(z), written $f(t) = L^{-1}[F(z)]$.

Example 7.2. We find the Laplace transform of sin at. We have

$$F(z) = \int_0^{\infty} e^{-zt} \sin at\, dt$$

Integration by parts yields

$$\int_0^R e^{-zt} \sin at\, dt = \frac{e^{-zt}}{a^2 + z^2} (-z \sin at - a \cos at) \Big|_{t=0}^{t=R}$$

If Rl z > 0, we have

$$\int_0^{\infty} e^{-zt} \sin at\, dt = \frac{a}{a^2 + z^2}$$

Table 7.1 lists Laplace transforms for some simple cases of f(t).


Problem 11. Derive the results of Table 7.1.
Problem 12. Show that L[af(t) + bg(t)] = aL[f(t)] + bL[g(t)].
Example 7.3. Let us consider the differential equation

$$\frac{d^2 y}{dt^2} + 3 \frac{dy}{dt} - 4y = 0 \tag{7.22}$$

subject to the initial conditions $y(0) = y_0$, $y'(0) = y_0'$. Assuming that the solution
of (7.22) and its derivatives are such that their Laplace transforms exist, we can apply
(7.19) and (7.21) and the result of Prob. 12 to obtain

$$z^2 L[y(t)] - z y_0 - y_0' + 3z L[y(t)] - 3y_0 - 4L[y(t)] = 0$$

so that

$$L[y(t)] = \frac{(z + 3) y_0 + y_0'}{(z + 4)(z - 1)} = \frac{y_0 - y_0'}{5} \frac{1}{z + 4} + \frac{4y_0 + y_0'}{5} \frac{1}{z - 1} = \frac{y_0 - y_0'}{5} L[\varphi_1(t)] + \frac{4y_0 + y_0'}{5} L[\varphi_2(t)]$$

From Table 7.1 we see that $\varphi_1(t) = e^{-4t}$, $\varphi_2(t) = e^{t}$, so that the suggested solution of
(7.22) is

$$y(t) = \frac{y_0 - y_0'}{5} e^{-4t} + \frac{4y_0 + y_0'}{5} e^{t} \tag{7.23}$$

One easily checks that y(t) of (7.23) is the required solution. It is to be noted that the
Laplace-transform method for solving (7.22) introduces the initial conditions in a
natural manner.
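
The check that (7.23) solves (7.22) can also be carried out numerically. The following is a minimal sketch, assuming central differences with a small step h are accurate enough for the purpose; the names `y_exact` and `residual` are illustrative, and `v0` stands for $y_0'$.

```python
import math

def y_exact(t, y0, v0):
    """The solution (7.23) of y'' + 3y' - 4y = 0 with y(0) = y0, y'(0) = v0."""
    return (y0 - v0) / 5.0 * math.exp(-4.0 * t) + (4.0 * y0 + v0) / 5.0 * math.exp(t)

def residual(t, y0, v0, h=1e-4):
    """Central-difference estimate of y'' + 3y' - 4y at time t."""
    ym = y_exact(t - h, y0, v0)
    yc = y_exact(t, y0, v0)
    yp = y_exact(t + h, y0, v0)
    d1 = (yp - ym) / (2.0 * h)           # approximates y'(t)
    d2 = (yp - 2.0 * yc + ym) / (h * h)  # approximates y''(t)
    return d2 + 3.0 * d1 - 4.0 * yc

# The residual should be small compared with the size of y itself.
print(y_exact(0.0, 2.0, -1.0), residual(0.7, 2.0, -1.0))
```

The residual is not exactly zero because of the finite-difference approximation; it shrinks as h is reduced until rounding error takes over.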









































TABLE 7.1

      f(t)                                 F(z) = \int_0^\infty e^{-zt} f(t) dt

  1   1                                    1/z,                  Rl z > 0
  2   e^{at}                               1/(z - a),            Rl z > Rl a
  3   sin at                               a/(z^2 + a^2),        Rl z > 0
  4   cos at                               z/(z^2 + a^2),        Rl z > 0
  5   sinh at                              a/(z^2 - a^2),        Rl z > |a|
  6   cosh at                              z/(z^2 - a^2),        Rl z > |a|
  7   (t/2a) sin at                        z/(z^2 + a^2)^2,      Rl z > 0
  8   (1/2a^3)(sin at - at cos at)         1/(z^2 + a^2)^2,      Rl z > 0
  9   J_0(t)                               1/\sqrt{z^2 + 1},     Rl z > 0
 10   t^{n-1}/(n - 1)!                     1/z^n,                Rl z > 0
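
A few entries of Table 7.1 can be checked by direct numerical integration. The sketch below is an illustration only, assuming real z > 0 and a truncation of the infinite range at a finite T where $e^{-zT}$ is negligible; the function name `laplace` and the values of T and n are arbitrary choices.

```python
import math

def laplace(f, z, T=60.0, n=60000):
    """Composite Simpson estimate of the integral of exp(-z t) f(t) over (0, T).

    Intended for real z > 0 and an even number n of subintervals; the tail
    beyond T is ignored on the assumption that exp(-z T) is negligible.
    """
    h = T / n
    total = f(0.0) + math.exp(-z * T) * f(T)
    for k in range(1, n):
        t = k * h
        weight = 4.0 if k % 2 else 2.0
        total += weight * math.exp(-z * t) * f(t)
    return total * h / 3.0

z, a = 1.5, 2.0
# Entries 3, 4, and 7 of Table 7.1, numerical value followed by the closed form.
print(laplace(lambda t: math.sin(a * t), z), a / (z * z + a * a))
print(laplace(lambda t: math.cos(a * t), z), z / (z * z + a * a))
print(laplace(lambda t: t * math.sin(a * t) / (2.0 * a), z), z / (z * z + a * a) ** 2)
```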
. oy ou 
Problem 13. Solve = + 27 — 3z by the Laplace-transform method, 


y(0) = 0, y’0) = 1. 
Problem 14. Let f(t) = 0 for t < 0. Show that

$$\int_0^{\infty} e^{-zt} f(t - a)\, dt = \int_a^{\infty} e^{-zt} f(t - a)\, dt = e^{-az} \int_0^{\infty} e^{-zt} f(t)\, dt$$

provided the integrals exist.




Example 7.4. Let us attempt to solve the wave equation

$$\frac{\partial^2 y(x, t)}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 y(x, t)}{\partial t^2}$$

subject to the boundary conditions $y(x, 0) = \frac{\partial y}{\partial t}(x, 0) = 0$ for all x, and y(0, t) = f(t)
for $t \ge 0$. We assume $y(\infty, t) = \lim_{x \to \infty} y(x, t) = 0$ for $t \ge 0$. Physically, we have
an elastic string from x = 0 to $x = \infty$, initially at rest. At the origin the string is
constrained to move in such a manner that y(0, t) = f(t), f(t) a given function of time.

If we multiply the wave equation by $e^{-zt}$ and integrate from t = 0 to $t = \infty$, we obtain

$$\frac{\partial^2}{\partial x^2} \int_0^{\infty} y(x, t) e^{-zt}\, dt = \frac{z^2}{c^2} L[y(x, t)]$$

provided we assume $\frac{\partial^2}{\partial x^2} \int_0^{\infty} y(x, t) e^{-zt}\, dt = \int_0^{\infty} \frac{\partial^2 y}{\partial x^2} e^{-zt}\, dt$. Thus L[y] satisfies

$$\frac{d^2}{dx^2} L[y] = \frac{z^2}{c^2} L[y] \tag{7.24}$$

A solution of (7.24) is

$$L[y] = \int_0^{\infty} y(x, t) e^{-zt}\, dt = A e^{(z/c)x} + B e^{-(z/c)x} \tag{7.25}$$

where A and B can be functions of z. Since y(x, t) tends to zero as x becomes infinite,
we choose A = 0. At x = 0 we need

$$\int_0^{\infty} f(t) e^{-zt}\, dt = B(z)$$

From Prob. 14 we have

$$\int_0^{\infty} y(x, t) e^{-zt}\, dt = e^{-zx/c} \int_0^{\infty} f(t) e^{-zt}\, dt = \int_0^{\infty} f\left(t - \frac{x}{c}\right) e^{-zt}\, dt \tag{7.26}$$

Equation (7.26) suggests that y(x, t) = f(t - x/c) for $t \ge x/c$ provided f(t - x/c) = 0
for t < x/c (see Prob. 14). It is a simple matter to show that f(t - x/c) satisfies the
wave equation and the boundary conditions. Of course one needs the fact that f(t) be
twice differentiable.

Example 7.5. We wish to find the function f(t) such that

$$\int_0^{\infty} f(t) e^{-zt}\, dt = (z^2 + 1)^{-1/2} \tag{7.27}$$

Let us assume that f(t) has the Taylor-series expansion, $\sum_{n=0}^{\infty} a_n t^n$. Without
justification let us assume that term-by-term integration is permissible. Hence

$$\int_0^{\infty} f(t) e^{-zt}\, dt = \sum_{n=0}^{\infty} a_n \int_0^{\infty} t^n e^{-zt}\, dt = \sum_{n=0}^{\infty} \frac{a_n\, n!}{z^{n+1}}$$

Now for |z| > 1 we know that

$$\frac{1}{\sqrt{z^2 + 1}} = \frac{1}{z \sqrt{1 + (1/z^2)}} = \sum_{m=0}^{\infty} \frac{(-1)^m (2m)!}{m!\, m!\, 2^{2m}} \frac{1}{z^{2m+1}}$$

Comparing the two Laurent series, we see that $a_n = 0$ if n is odd and, if n = 2m,
$a_{2m} = \dfrac{(-1)^m}{m!\, m!\, 2^{2m}}$, so that

$$f(t) = \sum_{m=0}^{\infty} \frac{(-1)^m}{m!\, m!} \left(\frac{t}{2}\right)^{2m} \tag{7.28}$$

We note that $f(t) = \sum_{m=0}^{\infty} \frac{(-1)^m}{(m!)^2} \left(\frac{t}{2}\right)^{2m} = J_0(t)$, where $J_0(t)$ is the Bessel function of
order zero of the first kind. One can start with f(t) of (7.28), justify the interchange
of integration and summation, and show that (7.27) results.
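
The comparison of the two Laurent series can be checked numerically. The sketch below assumes partial sums with a fixed number of terms are adequate for |z| > 1 and moderate t; the names `laurent_partial_sum` and `j0_series` are illustrative, and the numerical value used for the first zero of $J_0$ is quoted from standard tables rather than from the text.

```python
import math

def laurent_partial_sum(z, terms=60):
    """Partial sum of sum_m (-1)^m (2m)! / (m! m! 2^(2m)) / z^(2m+1), valid for |z| > 1."""
    total = 0.0
    for m in range(terms):
        coeff = (-1) ** m * math.comb(2 * m, m) / 4.0 ** m
        total += coeff / z ** (2 * m + 1)
    return total

def j0_series(t, terms=40):
    """Partial sum of the series (7.28)."""
    return sum((-1) ** m / (math.factorial(m) ** 2) * (t / 2.0) ** (2 * m)
               for m in range(terms))

# The Laurent series reproduces (z^2 + 1)^(-1/2) at z = 2.
print(laurent_partial_sum(2.0), 1.0 / math.sqrt(5.0))
# The series (7.28) nearly vanishes at the first zero of J_0 (tabulated value, an assumption here).
print(j0_series(2.404825557695773))
```

Here `math.comb(2 * m, m)` equals (2m)!/(m! m!), so each coefficient matches the one displayed above.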


Problems 
b d2y ; 
1. Solve dct + y =sin z with y(0) = 0, y’/(0) = 0. 
2 
2. Solve ot — dy = 3e?* with y(O0) = 0, y’(0) = 1. 
8. Solve 2x ou + SU = 2a, y(a, 0) = 1, ¥(0, t) = 1, for y(a, 0). 


Ans. y(z,t) =1+¢tfor0 <t < 2, y(z, t) = 1+ 2? fort > z?. 
4. By the inversion formula (see Sec. 7.4) solve for f(t) if

$$\frac{2az}{(z^2 + a^2)^2} = \int_0^{\infty} e^{-zt} f(t)\, dt$$

Ans. f(t) = t sin at (see Table 7.1).
5. From Prob. 4 show that

$$\int_0^{\infty} e^{-zt}\, t \cos at\, dt = \frac{z^2 - a^2}{(z^2 + a^2)^2}$$


6. Find f(¢) such that 





—zt a é 
fe f0 dt = — 


zz? — 1 


7.4. The Inversion Theorem. Let g(t) satisfy the requirements which
enable one to write

$$g(w) = \frac{1}{\pi} \int_0^{\infty} dv \int_{-\infty}^{\infty} g(t) \cos v(w - t)\, dt \tag{7.29}$$

(see Sec. 6.16). We assume that $\int_0^{\infty} f(w)\, dw$ converges absolutely, and
choose $g(t) = e^{-xt} f(t)$ for $t \ge 0$, g(t) = 0 for t < 0. Since

$$0 = \frac{1}{2\pi} \int_{-\infty}^{\infty} dv \int_{-\infty}^{\infty} g(t) \sin v(w - t)\, dt$$

we have

$$g(w) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ivw}\, dv \int_{-\infty}^{\infty} g(t) e^{-ivt}\, dt$$

$$e^{-xw} f(w) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ivw}\, dv \int_0^{\infty} e^{-xt} f(t) e^{-ivt}\, dt \tag{7.30}$$

Now assume

$$F(z) = \int_0^{\infty} e^{-zt} f(t)\, dt$$

exists for Rl $z \ge x > 0$. Then

$$\int_{x-iy}^{x+iy} F(z) e^{zw}\, dz = \int_{x-iy}^{x+iy} e^{zw}\, dz \int_0^{\infty} e^{-zt} f(t)\, dt = i e^{xw} \int_{-y}^{y} e^{ivw}\, dv \int_0^{\infty} e^{-xt} f(t) e^{-ivt}\, dt \tag{7.31}$$

on letting z = x + iv, v a new variable of integration. On letting y
become infinite and comparing (7.31) with (7.30) we see that

$$f(w) = \frac{1}{2\pi i} \int_{x-i\infty}^{x+i\infty} F(z) e^{wz}\, dz \tag{7.32}$$

If F(z) is the Laplace transform of f(w), (7.32) enables one to find f(w)
in terms of F(z). This is the Laplace-transform inversion theorem.
Equation (7.32) can often be evaluated by the calculus of residues.


Example 7.6. Let F(z) = 1/(z + 1). Then from (7.32)

$$f(w) = \frac{1}{2\pi i} \int_{x-i\infty}^{x+i\infty} \frac{e^{wz}}{z + 1}\, dz$$

with x > 0. We consider $\oint \frac{e^{wz}}{z + 1}\, dz$ around the path given in Fig. 7.1. Let the reader
show that, as the radius of the semicircle becomes infinite, the integrals of $e^{wz}/(z + 1)$
tend to zero along BC, CDE, EA. The residue of $e^{wz}/(z + 1)$ at z = -1 is $e^{-w}$. Thus

$$f(w) = \frac{1}{2\pi i} [2\pi i e^{-w}] = e^{-w}$$


7.5. The Calculus of Variations. 
The calculus of variations owes its 
beginning to a problem proposed by 
Johann Bernoulli near the completion 
of the seventeenth century. Suppose 
two points to be fixed in a vertical 
plane. What curve joining these two points will be such that a particle 
sliding (without friction) down this curve under the influence of gravity 
will go from the upper to the lower point in a minimum of time? This is 
the problem of the brachistochrone (shortest time). A problem of a 
similar nature is the following: What curve joining two fixed points is such 
that its rotation about a fixed line will yield a minimum surface of revolution?
This is the soap-film problem. A third problem asks the following
question: What curve lying on a sphere and joining two fixed points of the
sphere is such that the distance along the curve from one point to the
other is a minimum? This is the problem of geodesics.

Fig. 7.1

Let us obtain a mathematical formulation of these problems. 

1. Brachistochrone. Let the particle P move along any curve given by
y = y(x) (see Fig. 7.2). The speed of the particle is given by $v^2 = 2gy$
or $v = \sqrt{2g}\, y^{1/2}$. Hence

$$dt = \frac{ds}{v} = \frac{1}{\sqrt{2g}} \frac{ds}{y^{1/2}} = \frac{1}{\sqrt{2g}} \frac{\sqrt{dx^2 + dy^2}}{y^{1/2}} = \frac{1}{\sqrt{2g}} \sqrt{\frac{1 + (y')^2}{y}}\, dx$$

The total time of descent is given by

$$t = \frac{1}{\sqrt{2g}} \int_0^{x_0} \sqrt{\frac{1 + (y')^2}{y}}\, dx \tag{7.33}$$

Fig. 7.2

To solve the brachistochrone problem, one must find the function y(x)
which makes (7.33) a minimum.

2. Minimum Surface of Revolution. If the curve y(x) lying above the
x axis is rotated about the x axis, the surface of revolution generated in
this manner is

$$S = 2\pi \int y\, ds = 2\pi \int_{x_1}^{x_2} y \sqrt{1 + (y')^2}\, dx \tag{7.34}$$

To solve the soap-film problem, one must find y(x) such that S of (7.34)
is a minimum.
3. Geodesics of a Sphere. In a Euclidean space we have

$$ds^2 = dx^2 + dy^2 + dz^2$$

For a sphere $x = r \sin\theta \cos\varphi$, $y = r \sin\theta \sin\varphi$, $z = r \cos\theta$, so that

$$ds^2 = r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2) = r^2 \left[ 1 + \sin^2\theta \left( \frac{d\varphi}{d\theta} \right)^2 \right] d\theta^2$$

If $\varphi = \varphi(\theta)$ is any curve on the sphere, the distance between two points
of the sphere joined by $\varphi = \varphi(\theta)$ is given by

$$L = r \int_{\theta_1}^{\theta_2} \sqrt{1 + \sin^2\theta \left( \frac{d\varphi}{d\theta} \right)^2}\, d\theta \tag{7.35}$$

To find the geodesics of a sphere, one attempts to find $\varphi = \varphi(\theta)$ such that
L of (7.35) is a minimum.

Formulas (7.33) to (7.35) are special cases of a more general case. Let
f(x, y, y') be a function of the three variables x, y, y'. We wish to determine
y = y(x), and hence y' = y'(x), such that

$$I = \int_{x_1}^{x_2} f(x, y, y')\, dx \tag{7.36}$$

will be an extremal (minimum or maximum) subject to the restriction that
$y(x_1) = y_1$, $y(x_2) = y_2$. For (7.33), (7.34), (7.35) we have

$$f(x, y, y') = \frac{1}{\sqrt{2g}} \sqrt{\frac{1 + (y')^2}{y}}, \qquad f(x, y, y') = 2\pi y \sqrt{1 + (y')^2}, \qquad f(x, y, y') = r \sqrt{1 + \sin^2 x\, (y')^2}$$

respectively, with $x = \theta$, $y = \varphi$ in the last case.

Let us see how a problem in the calculus of variations differs from an
extremal problem of the ordinary calculus. In the latter case we are
given the function y = y(x). For each real x there corresponds a unique
real number y. Thus y = y(x) maps a set of real numbers into another
set of real numbers. The relation y = 1/x, $0 < x \le 1$, maps the
interval $0 < x \le 1$ into the interval $y \ge 1$. A simple problem in the
ordinary calculus is to find a number x which yields the minimum or
maximum value of y = y(x). Now (7.36) may also be looked upon as a
mapping. For any function y(x), (7.36) defines a real number I. Thus
(7.36) is a mapping of a function space [the space of y(x)] into the real-
number space. Our problem is to select a member of the function space
which yields a minimum or maximum value for I of (7.36). To resolve
this question, we reduce the problem to one of the ordinary calculus.

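7.6. The Euler-Lagrange Equation. Let us assume that there exists a
function y(x) which makes I of (7.36) an extremal. We now consider the
class of functions

$$Y(x, \lambda) = y(x) + \lambda \eta(x) \tag{7.37}$$

where $\eta(x)$ is an arbitrary differentiable function such that $\eta(x_1) = 0$,
$\eta(x_2) = 0$ and $\lambda$ is a real parameter. We have $Y(x_1, \lambda) = y(x_1) = y_1$,
$Y(x_2, \lambda) = y(x_2) = y_2$. For $\lambda = 0$ we have Y(x, 0) = y(x). As we vary
$\lambda$ and $\eta(x)$, we obtain a family of curves passing through the two
given points $P_1(x_1, y_1)$, $P_2(x_2, y_2)$. Moreover $\lambda = 0$ yields the desired
curve, y = y(x). The value of I for any member of (7.37) is

$$I(\lambda) = \int_{x_1}^{x_2} f(x, y + \lambda\eta, y' + \lambda\eta')\, dx \tag{7.38}$$

For any fixed $\eta(x)$ we know that $I(\lambda)$ is an extremal for $\lambda = 0$. From the
calculus, of necessity, $\left. \dfrac{dI}{d\lambda} \right|_{\lambda=0} = 0$. Assuming continuity of $\dfrac{\partial f}{\partial y}$ and $\dfrac{\partial f}{\partial y'}$,
we have

$$\left. \frac{dI}{d\lambda} \right|_{\lambda=0} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} \eta + \frac{\partial f}{\partial y'} \eta' \right) dx = \left. \frac{\partial f}{\partial y'} \eta \right|_{x_1}^{x_2} + \int_{x_1}^{x_2} \left[ \frac{\partial f}{\partial y} - \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) \right] \eta\, dx \tag{7.39}$$

upon integration by parts. Since $\eta(x_1) = \eta(x_2) = 0$, we have

$$\left. \frac{dI}{d\lambda} \right|_{\lambda=0} = \int_{x_1}^{x_2} \left[ \frac{\partial f}{\partial y} - \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) \right] \eta(x)\, dx = 0 \tag{7.40}$$

It is a simple matter to show that, if M(x) is continuous on $x_1 \le x \le x_2$,
then the vanishing of $\int_{x_1}^{x_2} M(x) \eta(x)\, dx$ for arbitrary $\eta(x)$ implies $M(x) \equiv 0$.
We leave this as an exercise for the reader. Hence if y(x) is the required
solution, of necessity, y(x) must satisfy the differential equation

$$\frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) = \frac{\partial f}{\partial y} \tag{7.41}$$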


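Equation (7.41) is the important Euler-Lagrange equation. It can be
written as a second-order differential equation in the form

$$\frac{\partial^2 f}{(\partial y')^2} \frac{d^2 y}{dx^2} + \frac{\partial^2 f}{\partial y\, \partial y'} \frac{dy}{dx} + \frac{\partial^2 f}{\partial x\, \partial y'} - \frac{\partial f}{\partial y} = 0 \tag{7.42}$$

If we differentiate $f - y' \dfrac{\partial f}{\partial y'}$ with respect to x along the curve y(x)
satisfying (7.41), we obtain

$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} y' + \frac{\partial f}{\partial y'} y'' - y'' \frac{\partial f}{\partial y'} - y' \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) = \frac{\partial f}{\partial x}$$

by making use of (7.41). Thus

$$\frac{d}{dx} \left( f - y' \frac{\partial f}{\partial y'} \right) = \frac{\partial f}{\partial x} \tag{7.43}$$

for an extremal. If f is explicitly independent of y, one has a first integral
of (7.41) given by $\dfrac{\partial f}{\partial y'}$ = constant. If f is explicitly independent of x,
$\dfrac{\partial f}{\partial x} = 0$ and (7.43) yields the first integral,

$$f - y' \frac{\partial f}{\partial y'} = \text{constant} \tag{7.44}$$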




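One may obtain the Euler-Lagrange equation from the following point
of view: First let us consider the equation $y(x) = x^3 - 2x$, so that

$$y(x + \lambda\, dx) = (x + \lambda\, dx)^3 - 2(x + \lambda\, dx)$$

Differentiating with respect to $\lambda$ yields $\dfrac{dy}{d\lambda} = 3(x + \lambda\, dx)^2\, dx - 2\, dx$, so
that $\left. \dfrac{dy}{d\lambda} \right|_{\lambda=0} = (3x^2 - 2)\, dx = dy$, the differential of y. Now
we have seen that (7.36) can be looked upon as a mapping of a function
space into a real-number space. We call this mapping a functional of y,
written I = I[y]. The first variation, or differential, of I may be
defined as

$$\delta I = \lim_{\lambda \to 0} \frac{I[y + \lambda\, \delta y] - I[y]}{\lambda} = \left. \frac{dI[y + \lambda\, \delta y]}{d\lambda} \right|_{\lambda=0} \tag{7.45}$$

where $\delta y(x)$ is any variation in y(x). Applying (7.45) to (7.36) yields

$$I[y + \lambda\, \delta y] = \int_{x_1}^{x_2} f(x, y + \lambda\, \delta y, y' + \lambda\, \delta y')\, dx$$

$$\delta I = \left. \frac{dI}{d\lambda} \right|_{\lambda=0} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} \delta y + \frac{\partial f}{\partial y'} \delta y' \right) dx \tag{7.46}$$

Integrating $\int_{x_1}^{x_2} \dfrac{\partial f}{\partial y'} \delta y'\, dx$ by parts yields

$$\delta I = \int_{x_1}^{x_2} \left[ \frac{\partial f}{\partial y} - \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) \right] \delta y\, dx$$

provided $\delta y(x_1) = \delta y(x_2) = 0$. Equation (7.41) holds, of necessity, if
$\delta I = 0$ for arbitrary $\delta y$.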

In the calculus it is necessary to examine the second derivative of y(x),
and, at times, higher derivatives, in order to determine the type of
extremal (maximum, minimum, point of inflection) encountered at the
point $x = x_1$ for which $\dfrac{dy}{dx} = 0$. Similarly, in the calculus of variations
it is necessary to examine the second variation in I to determine what
type of extremal is obtained from the solution of the Euler-Lagrange
equation. Let the reader show that the second variation of I can be
written as

$$\delta^2 I = \int_{x_1}^{x_2} \left[ \frac{\partial^2 f}{\partial y^2} (\delta y)^2 + 2 \frac{\partial^2 f}{\partial y\, \partial y'} \delta y\, \delta y' + \frac{\partial^2 f}{(\partial y')^2} (\delta y')^2 \right] dx \tag{7.47}$$


Example 7.7. Equations (7.33) and (7.34) are special cases of

$$I = \int_{x_1}^{x_2} g(y) \sqrt{1 + (y')^2}\, dx$$

We have $f = g(y)[1 + (y')^2]^{1/2}$, $\dfrac{\partial f}{\partial x} = 0$, so that a first integral is obtained from (7.44).
Let the reader show that (7.44) yields

$$\frac{g(y)}{\sqrt{1 + (y')^2}} = \text{constant} = a$$

From $\tan\theta = \dfrac{dy}{dx}$ we have $\cos\theta = 1/\sqrt{1 + (y')^2}$ so that

$$g(y) = a \sec\theta \tag{7.48}$$

Moreover $g'(y)\, dy = a \sec\theta \tan\theta\, d\theta$ from (7.48), and $dx = \cot\theta\, dy$ so that

$$dx = \frac{a \sec\theta\, d\theta}{g'(y)}$$

yielding

$$x = b + a \int_0^{\theta} \frac{\sec\theta\, d\theta}{g'(y)} \tag{7.49}$$

Given g(y), one solves (7.48) for y as a function of $\theta$. Then integration of (7.49) yields
x as a function of $\theta$. This parametric representation of x and y as functions of $\theta$ yields
the required curve which extremalizes I.

In the brachistochrone problem $g(y) = y^{-1/2}$, so that $y = c \cos^2\theta = (c/2)(1 + \cos 2\theta)$,
$c = 1/a^2$, from (7.48). Thus $g'(y) = -\tfrac{1}{2} y^{-3/2} = (-2c^{3/2} \cos^3\theta)^{-1}$, and (7.49) yields

$$x(\theta) = b - 2c \int_0^{\theta} \cos^2\theta\, d\theta = b - c \int_0^{\theta} (1 + \cos 2\theta)\, d\theta$$

$$x(\theta) = b - \frac{c}{2} (2\theta + \sin 2\theta), \qquad y(\theta) = \frac{c}{2} (1 + \cos 2\theta) \tag{7.50}$$

Equation (7.50) is the parametric equation of a cycloid.
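
As a check on Example 7.7, one can verify numerically that the cycloid (7.50) keeps the first integral $g(y)/\sqrt{1 + (y')^2}$ constant and equal to $a = c^{-1/2}$. The sketch below is illustrative only, assuming the slope dy/dx may be estimated by central differences in the parameter $\theta$; the function names and the step h are arbitrary choices.

```python
import math

def cycloid(theta, b=0.0, c=1.0):
    """The point (x, y) of (7.50) for parameter theta."""
    x = b - 0.5 * c * (2.0 * theta + math.sin(2.0 * theta))
    y = 0.5 * c * (1.0 + math.cos(2.0 * theta))
    return x, y

def first_integral(theta, c=1.0, h=1e-6):
    """g(y) / sqrt(1 + y'^2) with g(y) = y**-0.5, slope from central differences."""
    x0, y0 = cycloid(theta - h, c=c)
    x1, y1 = cycloid(theta + h, c=c)
    slope = (y1 - y0) / (x1 - x0)   # approximates dy/dx = tan(theta)
    _, y = cycloid(theta, c=c)
    return y ** -0.5 / math.sqrt(1.0 + slope * slope)

# With c = 4 the constant a = c**-0.5 equals 0.5 at every theta in (-pi/2, pi/2).
for theta in (0.3, 1.0, -0.8):
    print(theta, first_integral(theta, c=4.0))
```

The values of $\theta$ are kept away from $\pm\pi/2$, where y vanishes and g(y) is unbounded.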
Example 7.8. Variable-end-point Problem. We are given the fixed curve $\Gamma$,
$y = \varphi(x)$, and the functional

$$I = \int_a^{b} F(x, y, y')\, dx$$

We wish to find the curve y = y(x) joining the fixed point $A(a, y_1)$ and $B(b, \varphi(b))$,
where B is a point on $\Gamma$ such that I is an extremal. The coordinate x = b is unknown.
If we consider the curve $y(x) + \delta y(x)$, we have

$$I[y + \delta y] = \int_a^{b + \delta x} F(x, y + \delta y, y' + \delta y')\, dx$$

The upper limit has changed since the end point of $y(x) + \delta y(x)$ is constrained to lie
on $y = \varphi(x)$. Let the reader show that

$$I[y + \delta y] - I[y] = \int_a^b [F(x, y + \delta y, y' + \delta y') - F(x, y, y')]\, dx + \int_b^{b + \delta x} F(x, y + \delta y, y' + \delta y')\, dx \tag{7.51}$$

From (7.51) it is logical to define $\delta I$ by

$$\delta I = \int_a^b \left( \frac{\partial F}{\partial y} \delta y + \frac{\partial F}{\partial y'} \delta y' \right) dx + F(x, y, y') \Big|_{x=b}\, \delta x \tag{7.52}$$

Now $\delta y = 0$ at x = a. We must compute $\delta y$ at x = b. We have

$$y(b + \delta x) + \delta y(b + \delta x) = \varphi(b + \delta x)$$

$y(b) = \varphi(b)$. Thus $y(b + \delta x) - y(b) + \delta y(b + \delta x) = \varphi(b + \delta x) - \varphi(b)$. Applying
the law of the mean one easily shows that

$$\delta y = [\varphi'(b) - y'(b)]\, \delta x$$

Integrating the second part of the integral of (7.52) by parts yields

$$\delta I = \int_a^b \left[ \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \right] \delta y\, dx + \left[ F + (\varphi' - y') \frac{\partial F}{\partial y'} \right]_{x=b} \delta x$$

If $\delta I = 0$ for arbitrary $\delta y$, of necessity

$$\left[ F + (\varphi' - y') \frac{\partial F}{\partial y'} \right]_{x=b} = 0 \tag{7.53}$$

along with the Euler-Lagrange equation.
Let us apply (7.53) to the problem of the minimum surface of revolution with $\varphi(x)$
arbitrary. We have $F(x, y, y') = y[1 + (y')^2]^{1/2}$, $\dfrac{\partial F}{\partial y'} = yy'[1 + (y')^2]^{-1/2}$, so that (7.53)
becomes

$$\{ y[1 + (y')^2]^{1/2} + yy'[1 + (y')^2]^{-1/2} (\varphi' - y') \}_{x=b} = 0$$

or $\varphi' y' = -1$ at x = b. Hence, at their point of intersection, y(x) and $\varphi(x)$ intersect
at right angles.


Problems 





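1. Show that the solution of the soap-film problem is $y = a \cosh \dfrac{x - b}{a}$.
2. Show that the geodesics on a sphere are arcs of great circles.
3. Derive (7.47).
4. Show that the Euler-Lagrange equation for the functional

$$I = \int_{x_1}^{x_2} f(x, y, y', y'')\, dx$$

is

$$\frac{d^2}{dx^2} \left( \frac{\partial f}{\partial y''} \right) - \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) + \frac{\partial f}{\partial y} = 0$$

5. Show that there are no extremals or stationary values for the functional

$$I[y] = \int_{x_1}^{x_2} y\, dx$$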


6. Consider the functional

$$I[x_1, x_2, \ldots, x_n] = \int_{t_1}^{t_2} F(x_1, x_2, \ldots, x_n, \dot{x}_1, \dot{x}_2, \ldots, \dot{x}_n, t)\, dt$$

Show that the Euler-Lagrange equations are

$$\frac{d}{dt} \left( \frac{\partial F}{\partial \dot{x}_i} \right) - \frac{\partial F}{\partial x_i} = 0, \qquad i = 1, 2, \ldots, n \tag{7.54}$$

Explain why it is necessary that

f(r, Le, , Xn, AX, AL2, of yt! SON Pap a Ag te GN iedig ape an Sg ag) 


7. Apply (7 54) of Prob 6 to 
t ] 
Ilr, y| = ‘i lS (x2 + y?) — ma | dt 
i} 4 
8. In Chap. 3 we saw that Lagrange's equations of motion could be written as

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}^{\alpha}} \right) - \frac{\partial L}{\partial q^{\alpha}} = 0$$

Show that this implies that the functional $\int_{t_1}^{t_2} L\, dt$ is an extremal for Newtonian
motion. For a conservative system, T + V = constant = h. Show that Newtonian
motion for a conservative system is such that $\int_{t_1}^{t_2} 2T\, dt$ is an extremal subject to the
condition T + V = constant.


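9. Consider the functional $I[z] = \iint_S F\left(x, y, z, \dfrac{\partial z}{\partial x}, \dfrac{\partial z}{\partial y}\right) dy\, dx$, the region of
integration, S, having the simple closed curve C as its boundary. Show that, for
I[z] to be an extremal, z = z(x, y) must satisfy

$$\frac{\partial F}{\partial z} - \frac{\partial}{\partial x} \left( \frac{\partial F}{\partial p} \right) - \frac{\partial}{\partial y} \left( \frac{\partial F}{\partial q} \right) = 0$$

with $p = \dfrac{\partial z}{\partial x}$, $q = \dfrac{\partial z}{\partial y}$ in F.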

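7.7. The Problem of Constraints. In the ordinary calculus one solved
problems of the following type: Given the function of two variables,
z = f(x, y), at what point P(x, y) is z an extremal subject to the constraint
$\varphi(x, y)$ = constant? We know that if f(x, y) is continuous on
the closed and bounded curve C, $\varphi(x, y)$ = constant, there will exist points
on C at which z takes on minimum and maximum values. One way to
solve this problem is to solve for y from $\varphi(x, y)$ = constant, obtaining
y = y(x), and then to extremalize z = f(x, y(x)), a one-dimensional problem.
Another method is due to Lagrange. For an extremal, $\dfrac{dz}{dx} = 0$,
so that $\dfrac{\partial f}{\partial x} + \dfrac{\partial f}{\partial y} \dfrac{dy}{dx} = 0$. This same equation can be obtained as follows:
Consider $w = f(x, y) + \lambda \varphi(x, y)$, $\lambda$ a parameter. Compute $\dfrac{\partial w}{\partial x}$ and $\dfrac{\partial w}{\partial y}$
as if x and y were independent variables. Setting these two partial
derivatives equal to zero yields

$$\frac{\partial w}{\partial x} = \frac{\partial f}{\partial x} + \lambda \frac{\partial \varphi}{\partial x} = 0, \qquad \frac{\partial w}{\partial y} = \frac{\partial f}{\partial y} + \lambda \frac{\partial \varphi}{\partial y} = 0$$

Eliminating $\lambda$ yields

$$\frac{\partial f}{\partial x} - \frac{\partial f}{\partial y} \frac{\partial \varphi / \partial x}{\partial \varphi / \partial y} = 0$$

Along the curve $\varphi(x, y)$ = constant, we have $\dfrac{\partial \varphi}{\partial x} + \dfrac{\partial \varphi}{\partial y} \dfrac{dy}{dx} = 0$, so that
$\dfrac{dy}{dx} = -\dfrac{\partial \varphi / \partial x}{\partial \varphi / \partial y}$, and Lagrange's method of multipliers yields
$\dfrac{\partial f}{\partial x} + \dfrac{\partial f}{\partial y} \dfrac{dy}{dx} = 0$, the required equation for an extremal.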
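
A concrete case may help. The sketch below is a hypothetical illustration, not an example from the text: it takes f(x, y) = xy with the constraint $\varphi(x, y) = x^2 + y^2 = 1$, locates the maximum by a simple scan around the circle, and checks that the gradients of f and $\varphi$ are parallel there, which is the condition obtained on eliminating $\lambda$.

```python
import math

def f(x, y):
    return x * y

def grad_f(x, y):
    return (y, x)

def grad_phi(x, y):
    # phi(x, y) = x^2 + y^2, the constraint function assumed for this sketch
    return (2.0 * x, 2.0 * y)

def best_on_circle(n=20000):
    """Scan the unit circle at n equally spaced angles and return the point maximizing f."""
    angles = (2.0 * math.pi * k / n for k in range(n))
    best = max(angles, key=lambda t: f(math.cos(t), math.sin(t)))
    return math.cos(best), math.sin(best)

x, y = best_on_circle()
fx, fy = grad_f(x, y)
px, py = grad_phi(x, y)
# At the constrained maximum, f_x * phi_y - f_y * phi_x should be close to zero.
print(x, y, f(x, y), fx * py - fy * px)
```

The quantity printed last is the left side of the eliminated equation multiplied by $\partial \varphi / \partial y$, so its vanishing is equivalent to the condition in the text wherever $\partial \varphi / \partial y \ne 0$.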
ox =dydz 


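The simplest constraint problem in the calculus of variations is the
following: We wish to extremalize the functional

$$I[y] = \int_a^b f(x, y, y')\, dx \tag{7.55}$$

subject to the constraint

$$\int_a^b \varphi(x, y, y')\, dx = \text{constant} \tag{7.56}$$

As in Sec. 7.6 we let $Y(x, \lambda_1, \lambda_2) = y(x) + \lambda_1 \eta_1(x) + \lambda_2 \eta_2(x)$,

$$\eta_1(a) = \eta_1(b) = \eta_2(a) = \eta_2(b) = 0$$

and we desire to extremalize

$$I(\lambda_1, \lambda_2) = \int_a^b f(x, y + \lambda_1 \eta_1 + \lambda_2 \eta_2, y' + \lambda_1 \eta_1' + \lambda_2 \eta_2')\, dx$$

subject to the condition

$$J(\lambda_1, \lambda_2) = \int_a^b \varphi(x, y + \lambda_1 \eta_1 + \lambda_2 \eta_2, y' + \lambda_1 \eta_1' + \lambda_2 \eta_2')\, dx = \text{constant}$$

$I(\lambda_1, \lambda_2)$ is to be an extremal at $\lambda_1 = \lambda_2 = 0$.
Let the reader show that the Euler-Lagrange equation becomes

$$\frac{d}{dx} \left[ \frac{\partial}{\partial y'} (f + \lambda \varphi) \right] - \frac{\partial}{\partial y} (f + \lambda \varphi) = 0 \tag{7.57}$$

The solution $y(x, \lambda)$ of (7.57) is substituted into (7.56) to eliminate $\lambda$.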
Example 7.9. Let us find y(x) such that the functional $I[y] = \int_{x_1}^{x_2} y\, dx$ is an
extremal subject to the constraint $\int_{x_1}^{x_2} \sqrt{1 + (y')^2}\, dx$ = constant = b. Since the
constraint states that the length of arc of the curve y(x) be a constant, this type of
problem is called an isoperimetric problem. We have $f + \lambda \varphi = y + \lambda \sqrt{1 + (y')^2}$,

$$\frac{\partial}{\partial y'} (f + \lambda \varphi) = \frac{\lambda y'}{\sqrt{1 + (y')^2}}$$

so that (7.57) yields

$$\frac{d}{dx} \left[ \frac{\lambda y'}{\sqrt{1 + (y')^2}} \right] - 1 = 0$$

or $y''/[1 + (y')^2]^{3/2} = 1/\lambda$. We recall that $y''/[1 + (y')^2]^{3/2}$ is the curvature, so that
y(x) has constant curvature, and hence must be an arc of a circle. The radius of the
circle is $\lambda$, which can be obtained from $\int_{x_1}^{x_2} \sqrt{1 + (y')^2}\, dx = b$.
Example 7.10. If the Riemannian metric (see Sec. 3.6) is extremalized subject to


8) ee" 
the constraint i Cae et sib: x") ~ ee _ ds, where ¢ga, a = 1, 2, 3, 4, 1s the electro- 
0 


magnetic vector potential, one obtains - motion of a charged particle in a gravita- 
tional and electromagnetic field, 


d?x* , da! an axe € 
ds? a ey ds ote m ao ds 0, Ke m 
Problems 


1. Find y(x) which extremalizes the functional $I[y] = \int_a^b (y')^2\, dx$ subject to the
constraint $\int_a^b xy\, dx$ = constant.

2. Find the curve of constant length joining two fixed points with the lowest center
of mass.
3. The area underneath the curve y = f(x) from x = a to x = b is rotated about
the x axis. Find the maximum volume obtained for $\int_a^b y\, dx$ = constant = c.


4. Derive (7.57). 




CHAPTER 8 


GROUP THEORY AND ALGEBRAIC EQUATIONS 


8.1. Introduction. The study of groups owes its beginning to an
attempt to solve algebraic equations of degree higher than 4. The linear
equation ax + b = 0, $a \ne 0$, has for its solution x = -b/a. The solution
of the quadratic equation $ax^2 + bx + c = 0$, $a \ne 0$, is known to be
$x_1 = (1/2a)(-b + \sqrt{b^2 - 4ac})$, $x_2 = (1/2a)(-b - \sqrt{b^2 - 4ac})$. We
note that $x_1$ and $x_2$ are written in terms of a finite number of operations
involving addition, subtraction, multiplication, division, and root extrac- 
tions. The operations are performed on the coefficients of the quad- 
ratic equation. We say that the quadratic equation is solvable by radi- 
cals. The cubic and quartic equations are solvable also by radicals. 
Lagrange attempted to extend this result to algebraic equations of degree 
higher than 4. He was unsuccessful, but his work laid the foundation 
which enabled Galois and Abel in the early part of the nineteenth century 
to grapple successfully with this problem. In general, an algebraic equa- 
tion of degree higher than 4 cannot be solved by radicals. The equation 
x’ — 1 = 0 can be solved by radicals, however. It remained for Cauchy 
to systematically begin the study of group theory proper. The theory of 
groups plays an outstanding role in the unification of mathematics. Its 
applications in mathematics are widespread, and, moreover, it has served 
an important role in the development of the modern quantum theory of 
physics. 

8.2. Definition of a Group. We consider first some elementary examples
of groups. Let us consider the set of all rationals, excluding zero,
subject to the rule of multiplication. We note that the product of two
rationals is again a rational. If a, b, c are rationals, then the associative
law, (ab)c = a(bc), holds. The unique rational 1 has the property that
$1 \cdot a = a \cdot 1 = a$, for all rationals a. Finally, for any rational a, there
exists a unique rational, $1/a = a^{-1}$, such that $aa^{-1} = a^{-1}a = 1$. Let the
reader show that the four elements (1, -1, i, -i) possess these same
properties under the operation of multiplication. Let us consider the
90°, 180°, 270°, and 360° = 0° rotations in a plane about a fixed point.
Let us denote these rotations by $A_1$, $A_2$, $A_3$, and $A_4 = E$, respectively.
By $A_2 A_1$ we mean a 90° rotation followed by a 180° rotation, etc. We
note that $A_2 A_3 = A_1$. A 270° rotation followed by a 180° rotation is
equivalent to a 450° = 90° rotation. Moreover $A_i E = E A_i = A_i$ for
i = 1, 2, 3, 4. E is the identity element in the sense that the rotation
E leaves a body invariant. We also note that $A_1 A_3 = A_3 A_1 = E$,
$A_2 A_2 = E$, EE = E, so that every element has a unique inverse. From
the fact that

$$e^{i\theta_1} e^{i\theta_2} = e^{i(\theta_1 + \theta_2)}, \quad e^{0i} = 1, \quad e^{(\pi/2)i} = i, \quad e^{\pi i} = -1, \quad e^{(3\pi/2)i} = -i$$

let the reader deduce a correspondence between the rotations discussed
above and the four elements (1, -1, i, -i) under multiplication. The
examples above lead us to the formal definition of a group. A discussion
of sets can be found in Sec. 10.7.

Let G consist of a set of objects A, B, C,.... An operator, ®, is 
associated with every pair of elements of G, ®(A, B) = A ® B. For con- 
venience we call the operator ® multiplication and write A @ B = AB. 
The set G is said to be a group relative to the operator ® if: 

I. For every A and B of G, AB = C implies C is a member of G. This 
is the closure property under @. 

II. For all A, B, and (' of G, 


(AB)C = A(BC) 


This is the associative law. 
III. There exists a unique element, E, of G, such that 


AE =EA=A 


for all AinG. E is called the identity, or unit, element. 
IV. For every A of G there exists a unique element, written A~!, such 
that 
AA1=A™*A=E 


‘We call A-! the inverse of A. It follows that A is the inverse of A-', 

One can replace III and IV by:

III'. For every $A$ and $B$ of $G$ there exist unique $X$ and $Y$ of $G$ such that
$AX = B$, $YA = B$.

By choosing $B = A$ we see that every element $A$ has a unique right
and left identity, from (III'). Thus $AE_1 = A$, $E_2A = A$. We show
now that $E_1 = E_2$. Now $A(E_2A) = AA = (AE_1)A$, so that $A = AE_2$
from (III'). Hence $E_1 = E_2$ from (III'). We show next that the identity
element for $A$ is the same as that for $B$ for all $A$ and $B$. We have
$AE_A = E_AA = A$, $BE_B = E_BB = B$. Now $B(E_BA) = (BE_B)A = BA$,
so that, from (III'), $E_BA = A$, which implies $E_B = E_A$. Let the reader
show that (III') implies (IV).
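For a finite set the four postulates can be checked by brute force. The following is a minimal sketch, not part of the original text; the function name `is_group` and the encoding of the rotations as angles in degrees are assumptions made for illustration.

```python
from itertools import product

def is_group(elements, op):
    """Check postulates I-IV of the text for a finite set under the operation op."""
    elements = list(elements)
    # I. Closure: every product lies in the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # II. Associativity: (ab)c = a(bc) for all triples.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # III. Exactly one two-sided identity element.
    identities = [e for e in elements
                  if all(op(a, e) == a and op(e, a) == a for a in elements)]
    if len(identities) != 1:
        return False
    e = identities[0]
    # IV. Exactly one two-sided inverse for every element.
    return all(sum(1 for b in elements if op(a, b) == e and op(b, a) == e) == 1
               for a in elements)

# The four plane rotations, composed by adding angles modulo 360 degrees.
rotation_group = is_group([0, 90, 180, 270], lambda a, b: (a + b) % 360)

# The four numbers 1, -1, i, -i under ordinary multiplication.
unit_group = is_group([1, -1, 1j, -1j], lambda a, b: a * b)

# The integers 1, 2, 3, 4 under multiplication fail closure (2 * 3 = 6).
not_group = is_group([1, 2, 3, 4], lambda a, b: a * b)
```

The check is exhaustive, so its cost grows with the cube of the number of elements; it is meant only for small examples such as those of this section.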




Example 8.1. Let $x$ and $y$ be elements of a set such that $x^2 = e$, $y^3 = e$, $yxy = x$.
We consider the set of elements $(x, y, y^2 = y \cdot y, xy, xy^2, e)$, $e$ the unit element. Let
us construct a multiplication table for these elements. We write $x \cdot x = x^2$, $y \cdot y = y^2$,
$y \cdot y \cdot y = y^3$, etc. From $x^2 = e$ we note that $x$ is its own inverse, and from $y^3 = e$ we
note that $y^2$ is the inverse of $y$. If we desire to compute $(xy^2)x$, we note that

$$(xy^2)x = (xy^2)(yxy) = xy^3xy = xexy = xxy = x^2y = ey = y$$

(see Table 8.1).
We note that each row and column of Table 8.1 contains the six elements of our set
with no repetitions. If $x \neq e$, $y \neq e$, let the reader show that the six elements are
distinct, and hence form a group.









































TABLE 8.1
(the entry in row p, column q is the product pq)

          |  e      x      y      y^2    xy     xy^2
    ------+-------------------------------------------
    e     |  e      x      y      y^2    xy     xy^2
    x     |  x      e      xy     xy^2   y      y^2
    y     |  y      xy^2   y^2    e      x      xy
    y^2   |  y^2    xy     e      y      xy^2   x
    xy    |  xy     y^2    xy^2   x      e      y
    xy^2  |  xy^2   y      x      xy     y^2    e

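The table of Example 8.1 can be generated mechanically from the defining relations. The sketch below is an illustration only; it assumes every element is written in the normal form $x^a y^b$ and uses the consequence $yx = xy^{-1} = xy^2$ of $yxy = x$, so that $y^b x^c = x^c y^{(-1)^c b}$. The names `multiply`, `name`, and `ORDER` are hypothetical.

```python
# Each element is stored as a pair (a, b) standing for x^a y^b,
# with a in {0, 1} and b in {0, 1, 2}.

def multiply(p, q):
    """Product (x^a y^b)(x^c y^d), using y^b x^c = x^c y^((-1)^c b)."""
    a, b = p
    c, d = q
    sign = -1 if c % 2 else 1
    return ((a + c) % 2, (sign * b + d) % 3)

def name(p):
    """Readable label for the pair (a, b)."""
    a, b = p
    if (a, b) == (0, 0):
        return "e"
    return ("x" if a else "") + {0: "", 1: "y", 2: "y^2"}[b]

# Same ordering as the rows and columns of Table 8.1.
ORDER = [(0, 0), (1, 0), (0, 1), (0, 2), (1, 1), (1, 2)]

# table[(row, column)] is the product row * column.
table = {(name(p), name(q)): name(multiply(p, q)) for p in ORDER for q in ORDER}

if __name__ == "__main__":
    labels = [name(p) for p in ORDER]
    print("      " + " ".join(f"{c:5}" for c in labels))
    for r in labels:
        print(f"{r:5} " + " ".join(f"{table[(r, c)]:5}" for c in labels))
```

Reading a product from `table[("xy^2", "x")]` reproduces the computation $(xy^2)x = y$ carried out in the example.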
Problems 


1. Verify Table 8.1.

2. Consider the set of square matrices, $\|a_{ij}\|$, $i, j = 1, 2, \ldots, n$, such that
$|a_{ij}| \neq 0$. Show that this set of matrices is a group under multiplication.

3. An Abelian group is one for which $AB = BA$ for all $A$ and $B$ of $G$. Is the group
of Example 8.1 an Abelian group? Show that the group of Prob. 2 is non-Abelian.

4. A finite group is one containing a finite number of distinct elements. Show that
we can replace (III) and (IV) by (III''): $AB = AC$ implies $B = C$, and $BA = CA$
implies $B = C$, for finite groups.

5. We define $a \cdot a = a^2$, $a \cdot a \cdot a = a^3$, etc. A group is said to be cyclic if a single
element generates every element of the group, that is, an element $a$ exists such that,
if $x$ is any element of the group, then $x = a^n$ for some positive integer $n$. Give an
example of a cyclic group.

6. Show that $A = (A^{-1})^{-1}$.

7. If $A$ and $B$ are elements of a group $G$, show that $(AB)^{-1} = B^{-1}A^{-1}$. Generalize
this result.

8. Show that the set of rational integers (positive and negative integers including
zero) form a group relative to the operation of addition. Do the set of rational
integers form a group relative to the operation of multiplication?


8.3. Finite Groups. A finite group is a group consisting of a finite 
number of distinct elements. We deduce now some theorems concern- 
ing finite groups and illustrate each theorem with an example. 




THEOREM 8.1. The order of a subgroup is a divisor of the order of the
complete group. The order of a group is the number of distinct elements
of the group. $H$ is a subgroup of $G$ if $H$ is a group and if, furthermore,
every element of $H$ belongs to $G$. If at least one member of $G$ is not a
member of $H$, we say that $H$ is a proper subgroup of $G$. The proof of
Theorem 8.1 is as follows: Let $H$ consist of the elements $E, A, B, \ldots, F$.
If $H = G$, there is no problem. Assume $H$ a proper subgroup of $G$, and
let $X$ be any element of $G$ not in $H$. Construct the set $H_1$ of elements
$XE, XA, XB, \ldots, XF$. Let the reader show that these elements are
distinct. Moreover, if $XA = B$, then $X = BA^{-1}$ is a member of $H$
since $H$ is a group. But $X$ is not a member of $H$ so that $XA \neq B$.
Hence every element of $H_1$ is distinct from every element of $H$. If the
members of $H$ and $H_1$ exhaust $G$, then $g = 2h$, where $g$ is the order of $G$
and $h$ is the order of $H$. If this is not the case, let $Y$ be a member of
$G$ not in $H$ or $H_1$. We now construct the set $H_2$ consisting of $YE$,
$YA, \ldots, YF$. One easily shows that the members of $H_2$ are distinct
from each other and are distinct from the elements of $H$ and $H_1$. If
$H$, $H_1$, $H_2$ exhaust $G$, then $g = 3h$. If not, we continue in the same
manner. Eventually we must exhaust $G$ since $G$ has a finite number of
elements. Thus $g = nh$, and $h$ divides $g$.
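The counting argument in the proof of Theorem 8.1 can be traced on a small case. The sketch below is illustrative only; it assumes permutations of three symbols stored as 0-indexed tuples, and the helper names `compose` and `left_cosets` are hypothetical. It builds the sets $XH$ one at a time until the group is exhausted.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))   # all six permutations of three symbols
H = [(0, 1, 2), (1, 0, 2)]         # a subgroup of order 2

def left_cosets(G, H):
    """Build H, XH, YH, ... as in the proof, until every element of G is covered."""
    cosets, covered = [], set()
    for x in G:
        if x not in covered:
            coset = frozenset(compose(x, h) for h in H)
            cosets.append(coset)
            covered |= coset
    return cosets

cosets = left_cosets(G, H)

if __name__ == "__main__":
    print(f"g = {len(G)}, h = {len(H)}, number of sets n = {len(cosets)}")
```

The sets are pairwise disjoint and of equal size, which is exactly why $g = nh$.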


Example 8.2. The group of Table 8.1 consists of six elements. A subgroup of this
group is $H(x, x^2 = e)$. The order of $H$ is 2, and $6 = 2 \cdot 3$. Another proper subgroup
of $G$ is $K(y, y^2, y^3 = e)$, $6 = 3 \cdot 2$. Is it possible for $G$ to have a subgroup of order 4?


THEOREM 8.2. Every subgroup of a cyclic group is a cyclic group. The
definition of a cyclic group is given in Prob. 5, Sec. 8.2. Let $G$ consist of
$A, A^2, \ldots, A^n = E$. Let $H$ be a proper subgroup of $G$ with elements

$$A^b, A^{b_1}, A^{b_2}, \ldots, A^{b_r}, \qquad b < b_1 < b_2 < \cdots < b_r$$

Since $b_1 > b$, we have $b_1 = qb + s$, $0 \leq s < b$. Then

$$A^{b_1} = A^{qb+s} = A^{qb}A^s$$

and $A^{b_1 - qb} = A^s$. If $s \neq 0$, then $A^s$ is a member of $H$, a contradiction,
since $b$ was assumed to be the smallest exponent of $A$ such that $A^b$ is
in $H$. Hence $s = 0$, and $b_1 = qb = 2b$, since $A^bA^b = A^{2b}$ is in $H$. The
only elements of $H$ are of the form $A^{mb}$, so that $H$ is cyclic.


Example 8.3. Let $G$ be a cyclic group of order 8, so that

$$(a, a^2, a^3, a^4, a^5, a^6, a^7, a^8 = e)$$

are the elements of $G$. Consider the subgroup $H(a^2, a^4, a^6, a^8 = e)$. We note that $H$
is cyclic since $a^4 = (a^2)^2$, $a^6 = (a^2)^3$, $a^8 = (a^2)^4$.




THEOREM 8.3. A criterion for a subgroup is the following: Let $G$ be
a finite group, and let $S$ be a subset of $G$ such that the product of any
two elements of $S$ is again an element of $S$. Then $S$ is a subgroup of $G$.

Certainly the closure and associative properties hold for $S$. Now
let $A$ be any element of $S$. Then $A, A^2, \ldots, A^r, \ldots$ belong to $S$.
There can exist only a finite number of distinct elements of the type $A^r$.
Thus $A^n = A^m$ for $n > m$, and $A^{n-m} = E$ belongs to $S$. Moreover
$AA^{n-m-1} = E$, so that $A^{n-m-1} = A^{-1}$ belongs to $S$. Q.E.D.


Example 8.4. Let us consider the permutations of the integers (1, 2, 3). We
obtain the six permutations (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1).
We can consider the particular permutation (2, 3, 1) as being obtained from a
substitution of the integers 1, 2, 3, in the sense that $1 \to 2$, $2 \to 3$, $3 \to 1$, written
$\begin{pmatrix}1&2&3\\2&3&1\end{pmatrix}$. In this way we obtain the six elements

$$S_1 = \begin{pmatrix}1&2&3\\1&2&3\end{pmatrix} \quad S_2 = \begin{pmatrix}1&2&3\\1&3&2\end{pmatrix} \quad S_3 = \begin{pmatrix}1&2&3\\2&1&3\end{pmatrix} \quad S_4 = \begin{pmatrix}1&2&3\\2&3&1\end{pmatrix} \quad S_5 = \begin{pmatrix}1&2&3\\3&1&2\end{pmatrix} \quad S_6 = \begin{pmatrix}1&2&3\\3&2&1\end{pmatrix}$$

If we consider a triangle with vertices labeled 1, 2, 3, respectively, then $S_6$ states that
we interchange the labels 1 and 3 and leave label 2 invariant. Let us consider any
function of three variables, $f(x_1, x_2, x_3)$. The operation of $S_2$ on $f$ yields

$$S_2 f(x_1, x_2, x_3) = f(x_1, x_3, x_2)$$

If we follow this by the operation $S_5$, we obtain

$$S_5S_2 f(x_1, x_2, x_3) = S_5 f(x_1, x_3, x_2) = f(x_3, x_2, x_1)$$

since $S_5$ permutes 1 into 3, 2 into 1, and 3 into 2. Thus

$$S_5S_2 f(x_1, x_2, x_3) = S_6 f(x_1, x_2, x_3)$$

and it is natural to define $S_5S_2 = S_6$, written
$\begin{pmatrix}1&2&3\\3&1&2\end{pmatrix}\begin{pmatrix}1&2&3\\1&3&2\end{pmatrix} = \begin{pmatrix}1&2&3\\3&2&1\end{pmatrix}$. We can look
upon $\begin{pmatrix}1&2&3\\3&1&2\end{pmatrix}\begin{pmatrix}1&2&3\\1&3&2\end{pmatrix}$ as follows: Starting with the right-hand side, we see that 1 goes into
1, then, moving to the left, we see that 1 goes into 3. The final result is the permutation
of 1 into 3. Again, on the right-hand side, 2 goes into 3, and, moving to the left,
3 goes into 2, so that the end result is to leave 2 invariant. $3 \to 2$ followed by $2 \to 1$
yields $3 \to 1$. The product yields $\begin{pmatrix}1&2&3\\3&2&1\end{pmatrix} = S_6$. It follows that $S_1$ plays the role of
the identity element of the group, and we leave it as an exercise to show that the
elements $S_i$, $i = 1, 2, \ldots, 6$, do, indeed, form a group relative to multiplication
defined above. The order of the group is $3! = 6$. Let the reader obtain a generalization
for the permutation group of order $n!$, often called the symmetric group.


We consider now the function

$$f(x_1, x_2, x_3) = (x_1 - x_2)(x_2 - x_3)(x_3 - x_1)$$

We note that $S_1f = f$, $S_2f = -f$, $S_3f = -f$, $S_4f = f$, $S_5f = f$, $S_6f = -f$, so that $S_1$,
$S_4$, and $S_5$ leave $f$ invariant. These elements are called the even permutations. It is
a simple matter to show that the product of two even permutations is again an even
permutation. Hence, from Theorem 8.3, the set $(S_1, S_4, S_5)$ is a subgroup of the
symmetric group of order $3!$. Do $(S_2, S_3, S_6)$ form a group? These are the odd
permutations.

Any subgroup of a symmetric group is called a regular permutation group. 
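The product rule and the even/odd classification of Example 8.4 can be checked numerically. The sketch below is an illustration under stated assumptions: each permutation is stored as the bottom row of its two-row symbol, the product is taken right to left as in the example, and the names `compose`, `apply_to_f`, and `even` are hypothetical.

```python
# Bottom rows of the six substitutions S_1, ..., S_6 of Example 8.4.
S = {1: (1, 2, 3), 2: (1, 3, 2), 3: (2, 1, 3),
     4: (2, 3, 1), 5: (3, 1, 2), 6: (3, 2, 1)}

def compose(p, q):
    """Product pq: the symbol i goes to q(i) first, then to p(q(i))."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def f(x1, x2, x3):
    return (x1 - x2) * (x2 - x3) * (x3 - x1)

def apply_to_f(s, x):
    """Value of (S f)(x1, x2, x3) = f(x_{s(1)}, x_{s(2)}, x_{s(3)})."""
    return f(*(x[s[i] - 1] for i in range(3)))

x = (1, 2, 3)   # any three distinct numbers will do
even = [k for k, s in S.items() if apply_to_f(s, x) == f(*x)]
odd = [k for k, s in S.items() if apply_to_f(s, x) == -f(*x)]

# The even permutations are closed under multiplication (Theorem 8.3).
even_closed = all(compose(S[j], S[k]) in [S[m] for m in even]
                  for j in even for k in even)

if __name__ == "__main__":
    print("S5 S2 =", compose(S[5], S[2]))
    print("even:", even, "odd:", odd)
```

The product of two odd permutations can be inspected the same way, for instance with `compose(S[2], S[3])`.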


Problems 


1. Show that the elements of Example 8.4 form a group. Construct the multiplication
table for this group. Is this group Abelian? Construct all the proper subgroups.

2. Show that the product of two even permutations is an even permutation. Consider
the cases of the product of an even with an odd permutation and the product of
two odd permutations.

3. We can write $S_4 = \begin{pmatrix}1&2&3\\2&3&1\end{pmatrix} = (123)$ in the sense that $1 \to 2$, $2 \to 3$, $3 \to 1$. $S_6$
can be written $S_6 = (13)(2)$. Do the same for $S_1$, $S_2$, $S_3$, $S_5$.

4. Show that the permutation group of order $n!$ contains a subgroup of order $n!/2$.

5. $C$ is called the commutator of two elements $A$ and $B$ of a group if $C = (AB)^{-1}BA$.
For any element $X$ of $G$ show that $X^{-1}CX$ is the commutator of $X^{-1}AX$ and $X^{-1}BX$.


8.4. Isomorphisms. Let us consider two groups $G_1$ and $G_2$. If a
one-to-one correspondence can be established such that

$$A \leftrightarrow A', \qquad B \leftrightarrow B'$$

implies

$$AB \leftrightarrow A'B'$$

for all $A$ and $B$ in $G_1$, we say that the two groups $G_1$ and $G_2$ are isomorphic
to each other and write $G_1 \cong G_2$. $A', B', \ldots$ belong to $G_2$.

An isomorphism of two groups implies that the two groups are equivalent
in the sense that we are using two different languages to describe
the elements of the groups. It is apparent that any theorem obtained
from the fact that $G_1$ is a group will also hold for $G_2$.


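Example 8.5. We consider a cyclic group of order 4 with elements $A$, $A^2$, $A^3$,
$A^4 = E$ along with a subgroup of the symmetric group of order $4!$. Let the reader
show that the elements

$$S_1 = \begin{pmatrix}1&2&3&4\\2&3&4&1\end{pmatrix} \quad S_2 = \begin{pmatrix}1&2&3&4\\3&4&1&2\end{pmatrix} \quad S_3 = \begin{pmatrix}1&2&3&4\\4&1&2&3\end{pmatrix} \quad S_4 = \begin{pmatrix}1&2&3&4\\1&2&3&4\end{pmatrix}$$

form a cyclic group with $S_2 = S_1^2$, $S_3 = S_1^3$, $S_4 = E = S_1^4$. It is a simple matter to
show that the correspondence

$$A \leftrightarrow S_1, \quad A^2 \leftrightarrow S_2, \quad A^3 \leftrightarrow S_3, \quad A^4 \leftrightarrow S_4$$

is an isomorphism. We leave this as an exercise for the reader.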


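An important theorem due to Cayley is stated as follows:
THEOREM 8.4. Every finite group $G$ is isomorphic to a regular permutation
group. Let the elements of the group be written as $E = A_1, A_2, \ldots, A_n$.
Let $A_k$ be any member of $G$, and consider the elements

$$A_kA_1, A_kA_2, \ldots, A_kA_j, \ldots, A_kA_n$$

Since $A_kA_1$ belongs to $G$, it must be equal to some $A_i$, and by $A_{k_1}$ we mean
this $A_i$. Similarly $A_kA_2 = A_{k_2}$, etc. Let $A_k$ correspond to

$$A_k \leftrightarrow \begin{pmatrix}1&2&\cdots&n\\k_1&k_2&\cdots&k_n\end{pmatrix} \tag{8.1}$$

for $k = 1, 2, \ldots, n$. We wish to show that (8.1) represents an isomorphism.
We have from (8.1)

$$A_j \leftrightarrow \begin{pmatrix}1&2&\cdots&n\\j_1&j_2&\cdots&j_n\end{pmatrix}$$

$$A_jA_k \leftrightarrow \begin{pmatrix}1&2&\cdots&n\\(jk)_1&(jk)_2&\cdots&(jk)_n\end{pmatrix}$$

where $(jk)_i$ denotes the index of $(A_jA_k)A_i$. But

$$\begin{pmatrix}1&2&\cdots&n\\j_1&j_2&\cdots&j_n\end{pmatrix}\begin{pmatrix}1&2&\cdots&n\\k_1&k_2&\cdots&k_n\end{pmatrix} = \begin{pmatrix}1&2&\cdots&n\\(jk)_1&(jk)_2&\cdots&(jk)_n\end{pmatrix}$$

since $i$ goes into $k_i$ and $k_i$ goes into the index of $A_jA_{k_i} = A_j(A_kA_i) = (A_jA_k)A_i$,
which establishes the isomorphism. The isomorphism of Example 8.5
was established in this fashion.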
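The construction (8.1) is easy to carry out for the cyclic group of Example 8.5. The sketch below is only an illustration; it assumes the element $A^r$ is labeled by its exponent $r = 1, 2, 3, 4$ (with $A^4 = E$), and the function names are hypothetical.

```python
n = 4
elements = [1, 2, 3, 4]          # the label r stands for A^r, with A^4 = E

def op(r, s):
    """Label of the product A^r A^s, reduced to the range 1..n."""
    return (r + s - 1) % n + 1

def regular_permutation(k):
    """Bottom row of (8.1): the symbol i goes to the label of A_k A_i."""
    return tuple(op(k, i) for i in elements)

def compose(p, q):
    """Product pq of permutations: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

perm = {k: regular_permutation(k) for k in elements}

# The correspondence A_k <-> perm[k] carries products into products.
is_isomorphism = (
    len(set(perm.values())) == n
    and all(perm[op(j, k)] == compose(perm[j], perm[k])
            for j in elements for k in elements)
)

if __name__ == "__main__":
    for k in elements:
        print(f"A^{k} <-> {perm[k]}")
```

The four bottom rows obtained are those of $S_1$, $S_2$, $S_3$, $S_4$ in Example 8.5.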

An automorphism is an isomorphism of a group with itself. A simple
example of an automorphism is as follows: Let $A_i$, $i = 1, 2, \ldots, n$, be
the elements of a group $G$, and let $X$ be any element of $G$. For convenience
we choose $X \neq E$. Let us construct the elements $X^{-1}A_iX$, $i = 1,
2, \ldots, n$. If $G$ is Abelian, we have $X^{-1}A_iX = A_iX^{-1}X = A_iE = A_i$.
Let us assume that $G$ is non-Abelian. Generally, then, $X^{-1}A_iX \neq A_i$.
It is a simple matter to prove, in any case, that $X^{-1}A_iX \neq X^{-1}A_jX$,
$i \neq j$, for if $X^{-1}A_iX = X^{-1}A_jX$, then

$$A_i = XX^{-1}A_iXX^{-1} = XX^{-1}A_jXX^{-1} = A_j$$

a contradiction if $i \neq j$. Since the set $S$ of elements $X^{-1}A_iX$, $i = 1,
2, \ldots, n$, contains $n$ distinct elements and since every member of $S$
belongs to $G$, $S = G$. We show now that the correspondence

$$A_i \leftrightarrow X^{-1}A_iX \qquad i = 1, 2, \ldots, n \tag{8.2}$$

is an automorphism. We have $A_i \leftrightarrow X^{-1}A_iX$, $A_j \leftrightarrow X^{-1}A_jX$,

$$A_iA_j \leftrightarrow X^{-1}A_iA_jX = X^{-1}A_iXX^{-1}A_jX = (X^{-1}A_iX)(X^{-1}A_jX)$$

which proves the automorphism of the correspondence (8.2).
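The correspondence (8.2) can be verified directly on a small non-Abelian group. The following sketch assumes the six permutations of three symbols, stored as 0-indexed tuples, with an arbitrarily chosen element `X`; all names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation: if p sends i to j, the inverse sends j to i."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(3)))
X = (1, 0, 2)                      # a fixed element different from the identity

def conjugate(a, x=X):
    """The element X^{-1} A X of (8.2)."""
    return compose(inverse(x), compose(a, x))

image = {a: conjugate(a) for a in G}

is_one_to_one = sorted(image.values()) == sorted(G)
preserves_products = all(image[compose(a, b)] == compose(image[a], image[b])
                         for a in G for b in G)
moves_some_element = any(image[a] != a for a in G)
```

For an Abelian group the same computation would give `moves_some_element == False`, in agreement with the remark that $X^{-1}A_iX = A_i$ in that case.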


Problems 


1. If $G_1$ and $G_2$ are isomorphic and if $G_1$ is cyclic, show that $G_2$ is cyclic.

2. Find the regular permutation group which is isomorphic to the group of Table 8.1.

3. Show that all cyclic groups of order $n$ are isomorphic (see Prob. 1).

4. An automorphism of the type (8.2) of the last paragraph is called an inner
automorphism. Show that the only inner automorphism of a cyclic group is the
trivial automorphism, $A_i \leftrightarrow A_i$, $i = 1, 2, \ldots, n$. This is called the identity
automorphism.

5. Prove that the product of two automorphisms is an automorphism. Hint: If
$A_i \leftrightarrow A_i'$ is an automorphism and $A_i \leftrightarrow A_i''$ is an automorphism, then under the
second automorphism we have $A_i' \leftrightarrow (A_i')''$, so that $A_i \leftrightarrow (A_i')''$ is another
correspondence, called the product of the two automorphisms. Show that $A_i \leftrightarrow (A_i')''$
represents an automorphism.

6. Show that the inner automorphisms of a group form a group.

7. Show that the automorphisms of a group form a group.

8. Find the inner automorphisms of the group given in Table 8.1.


8.5. Cosets. Conjugate Subgroups. Normal Subgroups. Let $G$ be a
group and $H$ a proper subgroup of $G$. Let the elements of $H$ be $H_1, H_2,
\ldots, H_n$, and let $A$ be an element of $G$ not in $H$. We consider the elements
$H_1A, H_2A, \ldots, H_nA$. The reader can quickly verify that
$H_iA \neq H_jA$ for $i \neq j$, and $H_iA \neq H_j$. If $H_iA = H_j$, then $A = H_i^{-1}H_j$
is in $H$, a contradiction. Thus the set of elements $H_iA$, $i = 1, 2, \ldots,
n$, are distinct and do not belong to $H$ if $A$ does not belong to $H$. This
set is not a subgroup of $G$ since it does not contain the identity element
$E$. We call this set a coset, written as $HA$. If the elements of $H$ and
$HA$ exhaust $G$, we can write $G = H + HA$. If not, we consider an
element $B$ of $G$, $B$ not in $H$ and $HA$. We construct the coset $HB$ and
omit the proof that the elements of $HB$ are distinct from the elements
of $H$ and $HA$. We can continue this process until we exhaust $G$, if $G$
is a finite group. Thus

$$G = H + HA + HB + \cdots + HC \tag{8.3}$$


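Example 8.6. We consider the symmetric group of order $4!$. Let $H$ consist of the
elements $S_1$, $S_2$, $S_3$, $S_4$ of Example 8.5. Let $A$ be the element $\begin{pmatrix}1&2&3&4\\2&1&4&3\end{pmatrix}$. Now

$$S_1A = \begin{pmatrix}1&2&3&4\\2&3&4&1\end{pmatrix}\begin{pmatrix}1&2&3&4\\2&1&4&3\end{pmatrix} = \begin{pmatrix}1&2&3&4\\3&2&1&4\end{pmatrix}$$

$$S_2A = \begin{pmatrix}1&2&3&4\\3&4&1&2\end{pmatrix}\begin{pmatrix}1&2&3&4\\2&1&4&3\end{pmatrix} = \begin{pmatrix}1&2&3&4\\4&3&2&1\end{pmatrix}$$

$$S_3A = \begin{pmatrix}1&2&3&4\\4&1&2&3\end{pmatrix}\begin{pmatrix}1&2&3&4\\2&1&4&3\end{pmatrix} = \begin{pmatrix}1&2&3&4\\1&4&3&2\end{pmatrix}$$

$$S_4A = \begin{pmatrix}1&2&3&4\\1&2&3&4\end{pmatrix}\begin{pmatrix}1&2&3&4\\2&1&4&3\end{pmatrix} = \begin{pmatrix}1&2&3&4\\2&1&4&3\end{pmatrix}$$

Note that $S_iA \neq S_jA$, $i \neq j$, and that $S_iA \neq S_j$ for all $i$, $j$. The elements $S_iA$,
$i = 1, 2, 3, 4$, are members of the coset $HA$. The element $B = \begin{pmatrix}1&2&3&4\\2&4&1&3\end{pmatrix}$ is not a
member of $H$ or $HA$. We consider

$$S_1B = \begin{pmatrix}1&2&3&4\\2&3&4&1\end{pmatrix}\begin{pmatrix}1&2&3&4\\2&4&1&3\end{pmatrix} = \begin{pmatrix}1&2&3&4\\3&1&2&4\end{pmatrix}$$

$$S_2B = \begin{pmatrix}1&2&3&4\\3&4&1&2\end{pmatrix}\begin{pmatrix}1&2&3&4\\2&4&1&3\end{pmatrix} = \begin{pmatrix}1&2&3&4\\4&2&3&1\end{pmatrix}$$

$$S_3B = \begin{pmatrix}1&2&3&4\\4&1&2&3\end{pmatrix}\begin{pmatrix}1&2&3&4\\2&4&1&3\end{pmatrix} = \begin{pmatrix}1&2&3&4\\1&3&4&2\end{pmatrix}$$

$$S_4B = \begin{pmatrix}1&2&3&4\\1&2&3&4\end{pmatrix}\begin{pmatrix}1&2&3&4\\2&4&1&3\end{pmatrix} = \begin{pmatrix}1&2&3&4\\2&4&1&3\end{pmatrix}$$

The elements $S_iB$, $i = 1, 2, 3, 4$, are members of the coset $HB$. Note that $S_iB \neq S_j$
for all $i$, $j$, $S_iB \neq S_jA$ for all $i$, $j$, $S_iB \neq S_jB$ for $i \neq j$. Continuing, we can obtain
$G = H + HA + HB + HC + HD + HF$.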
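The coset decomposition of Example 8.6 can be reproduced by direct computation. The sketch below assumes each permutation is stored as the bottom row of its two-row symbol and multiplied right to left; the helper names are hypothetical, and the order in which the remaining cosets are found depends on the enumeration, so it need not match the labels $HC$, $HD$, $HF$ of the text.

```python
from itertools import permutations

def compose(p, q):
    """Product pq: the symbol i goes to q(i) first, then to p(q(i))."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

# The subgroup H = (S_1, S_2, S_3, S_4) of Example 8.5.
H = [(2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3), (1, 2, 3, 4)]
A = (2, 1, 4, 3)
B = (2, 4, 1, 3)

HA = [compose(s, A) for s in H]    # the coset HA
HB = [compose(s, B) for s in H]    # the coset HB

def right_cosets(G, H):
    """Decompose G into H, HA, HB, ... as in (8.3)."""
    cosets, covered = [], set()
    for x in G:
        if x not in covered:
            coset = frozenset(compose(h, x) for h in H)
            cosets.append(coset)
            covered |= coset
    return cosets

G = list(permutations((1, 2, 3, 4)))
cosets = right_cosets(G, H)

if __name__ == "__main__":
    print("HA =", HA)
    print("HB =", HB)
    print("number of cosets:", len(cosets))
```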


THEOREM 8.5. The elements $X$ and $Y$ belong to the same coset if and only
if $XY^{-1}$ is in $H$. Assume $XY^{-1}$ a member of $H$, so that $XY^{-1} = H_1$,
$H_1$ in $H$. Then $X = H_1Y$, and $H_iX = H_iH_1Y = H_jY$ for all $i$, with
$H_j = H_iH_1$. Thus every member of $HX$ is a member of $HY$. From
$Y = H_1^{-1}X$, $H_iY = H_iH_1^{-1}X = H_kX$ for all $i$, we have that every member
of $HY$ belongs to $HX$. Thus $HX = HY$. Conversely, if $HX = HY$,
then $H_1$ and $H_2$ exist such that $H_1X = H_2Y$ so that $XY^{-1} = H_1^{-1}H_2$,
a member of $H$. Q.E.D.

DEFINITION 8.1. If $B = X^{-1}AX$, we say that $B$ is conjugate to $A$,
with $A$, $B$, $X$ in $G$. We note the following:

1. Every element is conjugate to itself, $A = E^{-1}AE$.

2. If $B$ is conjugate to $A$, then $A$ is conjugate to $B$, since $B = X^{-1}AX$
implies $A = XBX^{-1} = (X^{-1})^{-1}BX^{-1}$.

3. If $A$ and $B$ are conjugate to $C$, then $A$ and $B$ are conjugate. We
have

$$A = X^{-1}CX \qquad B = Y^{-1}CY = Y^{-1}(XAX^{-1})Y = (X^{-1}Y)^{-1}A(X^{-1}Y)$$

Let us consider a subgroup $H$ of $G$ consisting of the elements $E, A,
B, \ldots$. Let $X$ be a member of $G$ not in $H$. We form the elements
$X^{-1}EX = E$, $X^{-1}AX$, $X^{-1}BX$, $\ldots$ and designate this set as $X^{-1}HX$.
Since

$$(X^{-1}AX)(X^{-1}BX) = X^{-1}(AB)X$$

we have from Theorem 8.3 that the set $X^{-1}HX$ is a subgroup of $G$. We
have used the fact that $H$ is a group, which implies that $AB$ is in $H$ if
$A$ and $B$ are in $H$. We call $H$ and $X^{-1}HX$ conjugate subgroups. By
considering $Y^{-1}HY$, etc., one can construct the complete set of conjugate
subgroups of $H$.

DEFINITION 8.2. If $H$ is identical with all its conjugate subgroups, then
$H$ is called a normal, or invariant, subgroup of $G$.




Example 8.7. The even permutations of a symmetric group form a subgroup, $H$.
If $X$ is any odd permutation, $X^{-1}AX$ is an even permutation if $A$ is an even permutation.
Thus $X^{-1}HX = H$ for all $X$ so that $H$ is a normal subgroup of $G$.


THEOREM 8.6. The intersection of two normal subgroups is a normal
subgroup. By the intersection of two subgroups $H_1$, $H_2$ we mean the
set of elements belonging to both $H_1$ and $H_2$, written $H_1 \cap H_2$. It is
obvious that the identity element $E$ belongs to $H_1 \cap H_2$. If $A$ and $B$
belong to $H_1 \cap H_2$, then $AB$ belongs to $H_1$ and to $H_2$ so that $AB$ belongs
to $H_1 \cap H_2$. From Theorem 8.3, $H_1 \cap H_2$ is a subgroup of the finite
group $G$. We must show now that $H_1 \cap H_2$ is a normal subgroup if $H_1$
and $H_2$ are normal subgroups. The set of elements $X^{-1}(H_1 \cap H_2)X$
belong to $H_1$ since any element of $H_1 \cap H_2$ belongs to $H_1$ and, moreover,
$H_1$ is normal. The same statement applies to $H_2$, so that every element
of $X^{-1}(H_1 \cap H_2)X$ belongs to $H_1 \cap H_2$. Thus $H_1 \cap H_2$ is identical with
all its conjugate groups so that $H_1 \cap H_2$ is normal.
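Definition 8.2 translates into a direct test: form $X^{-1}HX$ for every $X$ of $G$ and compare with $H$. The sketch below is an illustration on the permutations of three symbols (0-indexed tuples); the function `is_normal` and the two subgroups chosen are assumptions made for the example.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def is_normal(H, G):
    """True if X^{-1} H X coincides with H for every X of G."""
    H_set = set(H)
    return all({compose(inverse(x), compose(h, x)) for h in H} == H_set
               for x in G)

G = list(permutations(range(3)))

# The even permutations (identity and the two cyclic shifts), as in Example 8.7.
even = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

# The identity together with one interchange of two symbols.
two_element = [(0, 1, 2), (1, 0, 2)]

even_is_normal = is_normal(even, G)
two_element_is_normal = is_normal(two_element, G)
```

The second subgroup fails the test, which is the situation examined in Prob. 1 of the problems that follow.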


Problems 
1. The elements $\begin{pmatrix}1&2&3\\1&2&3\end{pmatrix}$, $\begin{pmatrix}1&2&3\\2&1&3\end{pmatrix}$ form a subgroup $H$ of the symmetric group of
order $3!$. Find the right-hand cosets $HX$ of $G$. Find the left-hand cosets $YH$ of $G$.
Show that $H$ is not a normal subgroup of $G$.

2. Show that the subgroup $H$, consisting of the elements $S_i$, $i = 1, 2, 3, 4$ of Example
8.5, is a normal subgroup of the symmetric group of order $4!$.

3. If $H$ is a subgroup of a cyclic group $G$, show that $H$ is normal.

4. A group $G$ is said to be simple if it contains no proper normal subgroups. Show
that all groups of prime order are simple. Show that all simple Abelian groups are of
prime order.

5. Let $M$ and $N$ be normal subgroups of $G$ containing only the identity $E$ in common.
Show that if $M_1$ and $N_1$ are any two elements of $M$ and $N$, respectively, then
$M_1N_1 = N_1M_1$. Hint: Consider $C = M_1^{-1}N_1^{-1}M_1N_1$, and show that $C = E$.

6. Consider the set of elements $E, A_1, A_2, \ldots$ of $G$ such that $A_i^{-1}XA_i = X$ for all
$X$ of $G$. Show that this set is an Abelian subgroup of $G$, called the central of $G$. Show
that the central of $G$ is normal. Find the central of the group given by Table 8.1.
Find the central of the symmetric group of order $3!$.


8.6. Factor, or Quotient, Groups. The set of rational integers form
a group relative to addition (see Prob. 8, Sec. 8.2). The zero element
plays the role of the identity element. A proper subgroup of this group
is the set of even integers including zero. The odd integers yield a coset
of the group of even integers, so that $I = I_0 + I_1$, where $I$ is the group
of rational integers, $I_0$ is the group of even integers, and $I_1$ is the set of
odd integers. $I_1$ is obviously not a group relative to addition since $I_1$
does not contain the identity element. Let the reader show that $I_0$ is a
normal subgroup of $I$. If we add any element of $I_0$ to any other element
of $I_0$, we obtain another element of $I_0$. We may thus write $I_0 + I_0 = I_0$.
If we add any element of $I_0$ to any element of $I_1$, we obtain an element
of $I_1$, so that $I_0 + I_1 = I_1 + I_0 = I_1$. Finally $I_1 + I_1 = I_0$. Thus if
we abstract in the sense that we denote all even integers by the single
element $I_0$ and denote all odd integers by the single element $I_1$, we note
that the two elements $I_0$, $I_1$ form a group relative to addition, with $I_0$
serving as the identity element. We generalize this result as follows:

Let $N$ be any normal subgroup of the group $G$. We consider $N$ along
with its cosets,

$$N, NA_1, NA_2, \ldots, NA_r \tag{8.4}$$

Since $N$ is normal, we have $X^{-1}NX = N$, that is, every element of the
set $X^{-1}NX$ belongs to $N$, and, conversely, every element of $N$ belongs to
$X^{-1}NX$. Thus $NX = XN$, and $NA_i = A_iN$, $i = 1, 2, \ldots, r$, so that
the right cosets of $N$ are equivalent to the left cosets of $N$. Let us consider
the product of two cosets of $N$. We have

$$(NA_i)(NA_j) = N(A_iN)A_j = N(NA_i)A_j = NN(A_iA_j) = N(A_iA_j)$$

with $N(A_iA_j)$ another coset of $N$. Conversely,

$$N(A_iA_j) = NN(A_iA_j) = N(NA_i)A_j = (NA_i)(NA_j)$$

If we look upon a coset as a single element, we note that the set (8.4)
forms a group. The element $N$ is the identity element for this group
abstracted from the group $G$ and the normal subgroup $N$.

DEFINITION 8.3. The group whose elements are constructed from the
normal subgroup $N$ of $G$ and the cosets of $N$ is called the factor group, or
quotient group, of $N$, written $G/N$. The order, or number of elements,
of $G/N$ is called the index of the factor group.

THEOREM 8.7. Let $H$ be a proper subgroup of $G$, written $H \subset G$. If
$N$ is a normal subgroup of $G$ such that $N \subset H \subset G$, then $H/N \subset G/N$.
Let the reader show that, if $N$ is a normal subgroup of $G$, then $N$ is a
normal subgroup of the subgroup $H$, provided $N \subset H \subset G$. Thus

$$H = N + NA_1 + NA_2 + \cdots + NA_r$$

and $H/N = (N, NA_1, \ldots, NA_r)$. Now

$$G = N + NA_1 + NA_2 + \cdots + NA_r + NX + \cdots + NY$$

with $G/N = (N, NA_1, \ldots, NA_r, NX, \ldots, NY)$. It is obvious that
$H/N \subset G/N$.
DEFINITION 8.4. The product $HK$ of two subgroups $H$ and $K$ of $G$ is
the set of elements $h_ik_j$, where $h_i$ ranges over $H$ and $k_j$ ranges over $K$.

THEOREM 8.8. If $H$ and $K$ are normal subgroups of $G$, their product
$L = HK$ is a normal subgroup of $G$. First we show that $HK$ is a subgroup
of $G$ and that $HK = KH$. Certainly the associative law holds
since $H$ and $K$ are in $G$. Moreover $k_1h_2k_1^{-1} = h_3$ since $H$ is normal,
so that $k_1h_2 = h_3k_1$ and $(h_1k_1)(h_2k_2) = (h_1h_3)(k_1k_2) = hk$, an element of $HK$. The
identity element of $HK$ is $e$, the identity element of $G$. We leave it to
the reader to show that $HK = KH$ and that the inverse of every element
of $HK$ is again an element of $HK$. Finally, for any element $X$ of $G$,

$$X^{-1}(HK)X = X^{-1}(HXX^{-1}K)X = (X^{-1}HX)(X^{-1}KX) = HK$$

since $H$ and $K$ are normal. Thus $HK$ is identical with all its conjugate
subgroups so that $HK$ is normal.

DEFINITION 8.5. By a maximal normal group $N$ of $G$ one understands
that $N$ is not contained properly in any normal subgroup of $G$ other than
$G$ itself.

THEOREM 8.9. Let $N_1$ and $N_2$ be normal maximal subgroups of $G$, $D$
their intersection, written $D = N_1 \cap N_2$. Then

$$G/N_2 \cong N_1/D \qquad G/N_1 \cong N_2/D \tag{8.5}$$

The intersection of two subgroups of $G$ is the set of elements common to
both $N_1$ and $N_2$. $D$ is nonvacuous since the identity element obviously
belongs to $D$. The reader should refer to Sec. 8.4 for the definition of an
isomorphism.

From Theorem 8.8 the product $N_1N_2$ is a normal group. Obviously
$N_1N_2$ contains both $N_1$ and $N_2$. Since $N_1$ and $N_2$ are maximal, it is
necessary that $N_1N_2 = G$. Since $D$ is a normal subgroup of $G$ (Theorem 8.6)
and is contained in $N_1$, $D$ is a normal subgroup of $N_1$, and it makes sense
to speak of $N_1/D$. Now

$$N_1 = D + DA_1 + DA_2 + \cdots + DA_r \tag{8.6}$$

with $A_i \neq A_j$ for $i \neq j$, and the cosets $DA_i$ are distinct. Let $L$ be the
set of elements belonging to $N_2$, $N_2A_i$, $i = 1, 2, \ldots, r$, so that

$$L = N_2 + N_2A_1 + N_2A_2 + \cdots + N_2A_r$$

We show first that $L = G$. We have

$$G = N_1N_2 = N_2N_1 = N_2(D + DA_1 + \cdots + DA_r) = N_2 + N_2A_1 + \cdots + N_2A_r$$

since $N_2D = N_2$. The cosets $N_2A_i$, $i = 1, 2, \ldots, r$, are distinct, for
$N_2A_i = N_2A_j$ implies $N_2A_iA_j^{-1} = N_2$, which further implies that $A_iA_j^{-1}$
is a member of $N_2$. Moreover $A_iA_j^{-1}$ is a member of $N_1$ since $A_i$ and
$A_j$ belong to $N_1$ [see (8.6)]. Thus $A_iA_j^{-1}$ belongs to both $N_1$ and $N_2$
and hence to $D$. This implies $DA_iA_j^{-1} = D$, $DA_i = DA_j$, a contradiction
to (8.6). The correspondence of cosets, $DA_i \leftrightarrow N_2A_i$, $i = 1,
2, \ldots, r$, with $D \leftrightarrow N_2$, yields the isomorphism $N_1/D \cong G/N_2$. Similarly
$N_2/D \cong G/N_1$. Q.E.D.




Example 8.8. We consider the cyclic group $G$ of order 6 with elements $a, a^2, a^3, a^4,
a^5, a^6 = e$. The three proper subgroups of $G$ are $N_1(a^2, a^4, a^6 = e)$, $N_2(a^3, a^6 = e)$,
$N_3(e)$. Let the reader show that $N_1$ and $N_2$ are maximal normal subgroups of $G$. We
also have that $D = N_3 = N_1 \cap N_2$. We have

$$G = (a^2, a^4, e) + (a^2, a^4, e)a = (a^2, a^4, e) + (a^3, a^5, a)$$

$$G = (a^3, e) + (a^3, e)a + (a^3, e)a^2 = (a^3, e) + (a^4, a) + (a^5, a^2)$$

$$G/N_1 = [(a^2, a^4, e), (a, a^3, a^5)]$$

$$G/N_2 = [(a^3, e), (a^4, a), (a^5, a^2)]$$

$$N_1/D = [(a^2), (a^4), (e)]$$

$$N_2/D = [(a^3), (e)]$$

The correspondence $(e) \leftrightarrow (a^2, a^4, e)$, $(a^3) \leftrightarrow (a, a^3, a^5)$ is an isomorphism, $N_2/D \cong
G/N_1$. Similarly $N_1/D \cong G/N_2$. By $(a^2, a^4, e) \cdot (a, a^3, a^5)$ we mean all elements
obtained by multiplying elements of $(a^2, a^4, e)$ with elements of $(a, a^3, a^5)$. Note that
$(a^2, a^4, e) \cdot (a, a^3, a^5) = (a, a^3, a^5)$ so that the element $(a^2, a^4, e)$ is the unit element
of $G/N_1$.
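The factor groups of Example 8.8 and the correspondence of Theorem 8.9 can be computed explicitly. The sketch below is an illustration only; it assumes the element $a^r$ of the cyclic group of order 6 is represented by the exponent $r$ taken modulo 6 (so that $e$ is 0 and multiplication becomes addition of exponents), and the names `coset`, `coset_product`, and `phi` are hypothetical.

```python
n = 6
G = range(n)                       # exponent r stands for a^r, with a^6 = e
N1 = frozenset({0, 2, 4})          # (a^2, a^4, e)
N2 = frozenset({0, 3})             # (a^3, e)
D = N1 & N2                        # the intersection, here the identity alone

def coset(N, r):
    """The coset N a^r, as a set of exponents."""
    return frozenset((h + r) % n for h in N)

def coset_product(P, Q):
    """All products of an element of P with an element of Q."""
    return frozenset((p + q) % n for p in P for q in Q)

G_mod_N1 = {coset(N1, r) for r in G}
G_mod_N2 = {coset(N2, r) for r in G}

# Correspondence D a^r <-> N2 a^r of Theorem 8.9, for a^r ranging over N1.
phi = {r: coset(N2, r) for r in N1}

maps_onto = set(phi.values()) == G_mod_N2
preserves_products = all(phi[(r + s) % n] == coset_product(phi[r], phi[s])
                         for r in N1 for s in N1)
```

Since $D$ contains only the identity here, each coset $Da^r$ is the single element $a^r$, which is why `phi` can be indexed by the elements of $N_1$ themselves.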
Problems 
1. Find the maximal normal subgroups of the cyclic group of order 12. Consider
any two such maximal normal subgroups, and show the isomorphism (8.5) by
constructing their cosets and finding their intersection.
2. Find the normal subgroups of the group of Table 8.1.
3. A group is said to be simple if it has no normal subgroup other than itself and the
identity element. Show that any group of prime order is simple.
4. Show that $G/H$ is simple if and only if $H$ is maximal.
5. Let $N$ be a normal subgroup of $G$, $H$ any subgroup of $G$. Let $D = H \cap N$,
$L = HN$. Show that $HN$ is a group, $D$ a normal subgroup of $H$, $L/N \cong H/D$.
6. Find the maximal normal subgroups of the symmetric group of order 24.


8.7. Series of Composition. The Jordan-Hölder Theorem. Let $N$ be
a normal subgroup of $G$. One then obtains the factor, or quotient, group

$$G/N = [N, NA_1, NA_2, \ldots, NA_r]$$

where $NA_i$, $i = 1, 2, \ldots, r$, is a coset consisting of the elements $N_jA_i$
with $N_j$ ranging over the group $N$. Let $T = [N, NB_1, \ldots, NB_s]$ be a
normal subgroup of $G/N$. We prove the following theorem:

THEOREM 8.10. Every normal group of a factor group $G/N$ yields a
normal group of $G$; each normal group of $G$ which contains $N$ corresponds
to a normal group of the factor group. The first part of Theorem 8.10
states that the set

$$H = N + NB_1 + \cdots + NB_s$$

is a normal subgroup of $G$. Now

$$X^{-1}HX = X^{-1}(N + NB_1 + \cdots + NB_s)X = X^{-1}NX + X^{-1}NB_1X + \cdots + X^{-1}NB_sX$$

Since $N$ is normal, we have $X^{-1}NX = N$ for all $X$ in $G$. Since $T$ is normal,
$(NX)^{-1}(NB_i)NX$ is in $T$. Thus $X^{-1}(B_iN)X$ is in $T$ and hence is an
element of $H$. Thus every element of $X^{-1}HX$ is an element of $H$ for all $X$ of $G$, which
proves that $H$ is normal. It is obvious that $G \supset H \supset N$. Now assume
that $G \supset H \supset N$ with both $H$ and $N$ normal. From

$$H = N + NA_1 + NA_2 + \cdots + NA_r$$

$$G = N + NA_1 + NA_2 + \cdots + NA_r + NA_{r+1} + \cdots + NA_s$$

it follows that $H/N \subset G/N$. We wish to show that $H/N$ is a normal
subgroup of $G/N$. From the fact that $H$ is normal and that $A_i$ belongs
to $H$ we have that $A_j^{-1}A_iA_j$ belongs to $H$. Thus

$$\begin{aligned}
(NA_j)^{-1}(NA_i)(NA_j) &= A_j^{-1}A_iNA_j \\
&= (A_j^{-1}A_iA_j)N \qquad \text{since } N \text{ is normal} \\
&= N(A_j^{-1}A_iA_j) \qquad \text{since } N \text{ is normal} \\
&= N(NA_k) = NA_k
\end{aligned}$$

Thus the conjugates of $H/N$ are members of $H/N$ so that $H/N$ is
normal. It may be that $A_j^{-1}A_iA_j = E$, the identity element, so that
$N(A_j^{-1}A_iA_j) = NE = N$. Q.E.D.

DEFINITION 8.6. Let $N_1$ be a maximal normal subgroup of $G$, $N_2$ a
maximal normal subgroup of $N_1$, etc. We obtain a series $G, N_1, N_2, \ldots,
N_k = E$, called a series of composition.

The factor groups $G/N_1, N_1/N_2, \ldots, N_{k-1}/N_k$ are all simple groups
(see Prob. 3, Sec. 8.6, for the definition of a simple group). If $N_i/N_{i+1}$
were not simple, Theorem 8.10 states that a normal group $M$ would
exist such that $N_i \supset M \supset N_{i+1}$, a contradiction, since $N_{i+1}$ is assumed
maximal relative to $N_i$. These simple groups are called the prime factors
of composition of $G$. The orders of $G/N_1, N_1/N_2, \ldots$ are called the
factors of composition of $G$.

THEOREM 8.11 (Jordan-Hölder). For two series of composition of a
finite group $G$ the prime-factor groups are isomorphic.

This means that if we consider two arbitrary series of composition of
$G$, say

$$\begin{aligned}
G &= H_0, H_1, H_2, \ldots, H_r = E \\
G &= K_0, K_1, K_2, \ldots, K_s = E
\end{aligned} \tag{8.7}$$

with prime-factor groups

$$\begin{aligned}
&G/H_1, H_1/H_2, \ldots, H_{r-1}/H_r \\
&G/K_1, K_1/K_2, \ldots, K_{s-1}/K_s
\end{aligned} \tag{8.8}$$

then $r = s$ and a one-to-one correspondence, $i \leftrightarrow j$, can be set up such that

$$H_i/H_{i+1} \cong K_j/K_{j+1} \tag{8.9}$$

with $i$ and $j$ ranging from 0 to $r - 1$.




The theorem is evidently true if the order of $G$ is prime, for in this case
there exists only one normal subgroup of $G$, the identity element, and
$G/E = G/E$. The proof of Theorem 8.11 is by induction. Assume the
theorem true for any group $G$ whose order, $g$, can be written as the product
of $n$ prime factors or less. We know that the theorem is true for
$n = 1$. Now let $G$ be any group whose order can be written as the product
of $n + 1$ prime factors, and let us consider two arbitrary series of
composition [see (8.7) and (8.8)]. If $H_1 = K_1$, then $G/H_1 = G/K_1$ and
the set of elements of $H_1$ is a group whose order contains at most $n$ prime
factors. By hypothesis $H_i/H_{i+1} \cong K_j/K_{j+1}$ for $i \geq 1$. We assume,
therefore, that $H_1$ and $K_1$ are distinct maximal normal subgroups of $G$.
From Theorem 8.9

$$G/H_1 \cong K_1/D_1 \qquad G/K_1 \cong H_1/D_1 \tag{8.10}$$

where $D_1 = H_1 \cap K_1$. Since $G/H_1$ and $G/K_1$ are simple, of necessity,
$K_1/D_1$ and $H_1/D_1$ are simple. Hence $D_1$ is maximal (see Prob. 4, Sec.
8.6). Now form the series of composition

$$\begin{aligned}
&G, H_1, D_1, D_2, \ldots, D_m = E \\
&G, K_1, D_1, D_2, \ldots, D_m = E
\end{aligned} \tag{8.11}$$

From (8.10) and the fact that $D_1/D_2 = D_1/D_2$, $D_2/D_3 = D_2/D_3$, etc.,
we see that the two series of (8.11) satisfy Theorem 8.11. But the series
$(H_1, H_2, \ldots, H_r)$, $(H_1, D_1, \ldots, D_m)$ satisfy Theorem 8.11 by assumption,
as do the series $(K_1, K_2, \ldots, K_s)$, $(K_1, D_1, \ldots, D_m)$. Thus
Theorem 8.11 holds for the two series $(H_1, H_2, \ldots, H_r)$, $(K_1, K_2, \ldots,
K_s)$, so that with (8.10) it is seen that the Jordan-Hölder theorem holds
for any group $G$ whose order can be written as a product of $n + 1$ prime
factors. This concludes the proof by induction. Example 8.8 is an
illustration of the Jordan-Hölder theorem.
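For cyclic groups the theorem can be observed by enumeration. The sketch below rests on assumptions not proved in this section: a cyclic group of order $m$ has exactly one subgroup of each order dividing $m$, every subgroup is normal, and its maximal subgroups are those of order $m/p$ with $p$ a prime divisor of $m$. A series of composition is then described by the orders of its members; the function names are hypothetical.

```python
def prime_divisors(m):
    """Distinct prime divisors of m, in increasing order."""
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes

def composition_series(m):
    """All chains of subgroup orders m > m/p > ... > 1 for a cyclic group of order m."""
    if m == 1:
        return [[1]]
    return [[m] + tail
            for p in prime_divisors(m)
            for tail in composition_series(m // p)]

def factors_of_composition(series):
    """Orders of the successive factor groups G/N_1, N_1/N_2, ..."""
    return [a // b for a, b in zip(series, series[1:])]

all_series = composition_series(12)
all_factors = [factors_of_composition(s) for s in all_series]

if __name__ == "__main__":
    for s, f in zip(all_series, all_factors):
        print(s, "factors of composition:", f)
```

Each factor group here is cyclic of prime order, so two of them are isomorphic exactly when their orders agree; the different series give the same factors in a different order.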


Problem. Construct the multiplication table for the octic group whose elements
are $u^iv^j$ ($i = 0, 1, 2, 3$; $j = 0, 1$) with $u^4 = 1$, $v^2 = 1$, $vu = u^3v$. Find the different
series of composition and prime-factor groups for the octic group, and show that the
Jordan-Hölder theorem applies.


8.8. Group Characters. Representation of Groups. If to each element
$s$ of a group $G$ we can assign a nonvanishing number, real or complex,
written $X(s)$, such that

$$X(s)X(t) = X(st) \tag{8.12}$$

we say that we have a one-dimensional representation of the group and
we call $X$ a group character. From (8.12) we have

$$X(s)X(e) = X(se) = X(s)$$

so that $X(e) = 1$, where $e$ is the unit element of the group $G$. A trivial
group character is $X(s) = 1$ for all $s$ of $G$.


Example 8.9. Let $G$ be a cyclic group of order $n$ with generator $a$, $a^n = e$. Let us
define $X(a^r)$ by the equation

$$X(a^r) = e^{(2\pi i/n)r} \qquad r = 1, 2, \ldots, n \tag{8.13}$$

We have $X(a^r)X(a^s) = e^{(2\pi i/n)r}e^{(2\pi i/n)s} = e^{(2\pi i/n)(r+s)} = X(a^{r+s})$, so that (8.13) defines
a character of the group. A set of characters $X_0, X_1, \ldots, X_{n-1}$ can be defined by

$$X_\nu(a^r) = e^{(2\pi i/n)\nu r} \qquad \nu = 0, 1, 2, \ldots, n - 1 \tag{8.14}$$

Let us note that

$$\sum_{r=1}^{n} X_\mu(a^r)\overline{X_\nu(a^r)} = \sum_{r=1}^{n} e^{(2\pi i/n)r(\mu - \nu)} = \begin{cases} n & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases} \tag{8.15}$$

The group characters of (8.14) are orthogonal in the sense that (8.15) holds. Moreover

$$\sum_{\mu=0}^{n-1} X_\mu(a^r)\overline{X_\mu(a^s)} = \sum_{\mu=0}^{n-1} e^{(2\pi i/n)\mu(r - s)} = \begin{cases} n & \text{if } r = s \\ 0 & \text{if } r \neq s \end{cases} \tag{8.16}$$
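The orthogonality relations (8.15) and (8.16) are easy to confirm numerically. The sketch below is an illustration with $n = 5$ chosen arbitrarily; it assumes the second factor in each sum is the complex conjugate, and the function names are hypothetical. Floating-point sums are compared with a small tolerance rather than exactly.

```python
import cmath

def character(nu, n):
    """The character X_nu of (8.14), as a function of the exponent r."""
    return lambda r: cmath.exp(2j * cmath.pi * nu * r / n)

def sum_over_elements(mu, nu, n):
    """Left side of (8.15): sum over r of X_mu(a^r) times the conjugate of X_nu(a^r)."""
    return sum(character(mu, n)(r) * character(nu, n)(r).conjugate()
               for r in range(1, n + 1))

def sum_over_characters(r, s, n):
    """Left side of (8.16): sum over mu of X_mu(a^r) times the conjugate of X_mu(a^s)."""
    return sum(character(mu, n)(r) * character(mu, n)(s).conjugate()
               for mu in range(n))

n = 5
X1 = character(1, n)

# Multiplicative property (8.12): X(a^r) X(a^s) = X(a^(r+s)).
is_multiplicative = abs(X1(2) * X1(4) - X1(6)) < 1e-9

if __name__ == "__main__":
    print(abs(sum_over_elements(2, 2, n)), abs(sum_over_elements(1, 3, n)))
```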


Let us now consider a set of $n \times n$ matrices $A_1, A_2, \ldots, A_r$ and a group $G$ with
elements $g_1, g_2, \ldots, g_r$. We assume that a one-to-one correspondence exists such
that

$$A_i \leftrightarrow g_i, \qquad A_j \leftrightarrow g_j$$

implies

$$A_iA_j \leftrightarrow g_ig_j$$

If such is the case, it is easy to show that the set of matrices is a group, $A$, isomorphic
to the group $G$. We say that the group of matrices, $A$, is a representation of the
group $G$. Any matrix $A = \|a^i_j\|$ can be thought of as representing an affine (linear)
transformation $y^i = a^i_jx^j$ (see Chap. 1), so that a representation of a group implies
that a group of affine transformations is isomorphic to the group $G$. Let the reader
show that, if $g_i \leftrightarrow A_i$ is a representation of $G$, then $g_i \leftrightarrow B^{-1}A_iB$ is also a representation
of $G$, $|B| \neq 0$.

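Example 8.10. A representation of the symmetric group of order 6 is easy to construct.
The identity permutation $\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}$ can be looked upon as the transformation
of coordinates $\bar{x} = x$, $\bar{y} = y$, $\bar{z} = z$, or

$\bar{x} = 1x + 0y + 0z$
$\bar{y} = 0x + 1y + 0z$
$\bar{z} = 0x + 0y + 1z$

yielding the unit matrix

$A_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$

The permutation $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$ can be looked upon as the transformation of coordinates

$\bar{x} = y = 0x + 1y + 0z$
$\bar{y} = x = 1x + 0y + 0z$
$\bar{z} = z = 0x + 0y + 1z$

yielding the matrix

$A_2 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$

Let the reader obtain the matrices corresponding to the permutations

$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$, $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$, $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$, $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$

Since every finite group is isomorphic to a regular permutation group (see Theorem
8.5), it follows very simply from Example 8.10 that one can always obtain a
representation of a finite group.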
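
The construction of Example 8.10 can be sketched in a few lines. The code below is an illustration under one stated convention (row i of the matrix has its 1 in column p[i], so that the new i-th coordinate is the old p[i]-th coordinate, with 0-based indices); the names `perm_matrix`, `matmul`, and `then` are hypothetical helpers, not notation from the text.

```python
from itertools import permutations

def perm_matrix(p):
    """Matrix of the coordinate change xbar_i = x_{p[i]} (0-based indices)."""
    n = len(p)
    return tuple(tuple(1 if j == p[i] else 0 for j in range(n)) for i in range(n))

def matmul(a, b):
    """Product of two square matrices stored as tuples of tuples."""
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

def then(p, q):
    """The permutation i -> q[p[i]] (look up p first, then q)."""
    return tuple(q[i] for i in p)

group = list(permutations(range(3)))            # the six elements of S_3
matrices = {p: perm_matrix(p) for p in group}

# The correspondence p <-> matrix preserves products (under this convention),
# so the six matrices form a group isomorphic to S_3.
for p in group:
    for q in group:
        assert matmul(matrices[p], matrices[q]) == matrices[then(p, q)]
```

With a different row/column convention the product order is reversed, but the set of six matrices is the same.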


It may happen that a matrix B exists such that $B^{-1} A_i B$ has the form

$\begin{pmatrix} B_1(A_i) & 0 & \cdots & 0 \\ 0 & B_2(A_i) & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & B_k(A_i) \end{pmatrix}$ (8.17)

where $B_j(A_i)$, $j = 1, 2, \ldots, k$, are matrices. If this is so, the original
representation is said to be reducible. One easily shows that, for each fixed j, the set of
matrices $\{B_j(A_1), B_j(A_2), \ldots\}$ is a representation of G. Thus
(8.17) yields k new representations of G. It may be that further reductions
are possible. If not, the representation given by (8.17) is said to be
irreducible. In this case the correspondence is written

$g_i \leftrightarrow B(A_i) = B_1(A_i) + B_2(A_i) + \cdots + B_k(A_i)$ (8.18)

where the sum in (8.18) denotes the matrix (8.17). Methods for determining
the irreducible representations of a group are laborious. The
reader may consult the references cited at the end of this chapter for
detailed information on irreducible representations.


Problems 


1. Derive (8.16).
2. Find a representation for the group of Table 8.1.
3. If the correspondence $g_i \leftrightarrow A_i$ is a representation, show that $X(g_i) = |A_i|$ is a
character of the group G.
4. Find a character for the group given by Table 8.1 other than the trivial character
$X(g_i) = 1$ for all $g_i$ of G.


8.9. Continuous Transformation Groups. Let us consider the transformation
of coordinates given by

$x_1 = f(x, y; t)$
$y_1 = \varphi(x, y; t)$ (8.19)

where t is a parameter ranging continuously over a given range of values.
The transformation (8.19) will be said to be a one-parameter transformation
group if the following is true:

1. There exists a $t_0$ such that

$x = f(x, y; t_0)$
$y = \varphi(x, y; t_0)$

This value of t, $t = t_0$, leaves the point (x, y) invariant. This corresponds
to the identity transformation.

2. The result of applying two successive transformations is identical
to a third transformation of the family given by (8.19). This implies
that if $x_2 = f(x_1, y_1; t_1)$, $y_2 = \varphi(x_1, y_1; t_1)$ then

$x_2 = f(f(x, y; t), \varphi(x, y; t); t_1) = f(x, y; t_2)$
$y_2 = \varphi(f(x, y; t), \varphi(x, y; t); t_1) = \varphi(x, y; t_2)$

where $t_2$ is a function of t and $t_1$.

3. The associative law must hold.

4. Each transformation has a unique inverse. This implies that we
can solve (8.19) uniquely for x and y such that

$x = f(x_1, y_1; t_1)$
$y = \varphi(x_1, y_1; t_1)$

with $t_1$ some function of t.


Example 8.11. The translations $x_1 = x + b$, $y_1 = y + 2b$ satisfy the requirements
for a one-parameter group. The identity transformation occurs for b = 0, and the
inverse transformation is obtained by replacing b by −b.

The rotations

$x_1 = x \cos\theta - y \sin\theta$
$y_1 = x \sin\theta + y \cos\theta$

form a one-parameter group. $\theta = 0$ yields the identity transformation. If
$x_2 = x_1 \cos\theta_1 - y_1 \sin\theta_1$, $y_2 = x_1 \sin\theta_1 + y_1 \cos\theta_1$, then

$x_2 = x \cos(\theta + \theta_1) - y \sin(\theta + \theta_1)$, $y_2 = x \sin(\theta + \theta_1) + y \cos(\theta + \theta_1)$

The parameter $\theta$ is additive for two successive transformations.
A third example is the group $x_1 = ax$, $y_1 = ay$. The identity transformation occurs
for a = 1. The parameter is multiplicative for two successive transformations.
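
The group requirements for the rotations and dilatations of Example 8.11 can be spot-checked numerically. This is a rough sketch at one arbitrarily chosen point and parameter pair, not a proof; `rotate`, `dilate`, and `close` are illustrative names.

```python
import math

def rotate(point, theta):
    """Rotation of Example 8.11 applied to point = (x, y)."""
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def dilate(point, a):
    """The third group of Example 8.11: x1 = a*x, y1 = a*y."""
    x, y = point
    return (a * x, a * y)

def close(p, q, tol=1e-12):
    """True if two points agree to within tol in each coordinate."""
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol

point = (1.5, -0.75)
theta, theta1 = 0.4, 1.1

assert close(rotate(point, 0.0), point)                       # identity at theta = 0
assert close(rotate(rotate(point, theta), theta1),
             rotate(point, theta + theta1))                   # additive parameter
assert close(rotate(rotate(point, theta), -theta), point)     # inverse is -theta

assert close(dilate(point, 1.0), point)                       # identity at a = 1
assert close(dilate(dilate(point, 2.0), 3.0), dilate(point, 6.0))  # multiplicative
```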


Infinitesimal transformations can be found as follows: Let $t_0$ be that
value of the parameter t which yields the identity transformation. Then

$x = f(x, y; t_0)$
$y = \varphi(x, y; t_0)$ (8.20)

If (8.19) is continuous in t, a small change in t will yield a transformation
differing from (8.20) by a small amount. The transformation

$x_1 = f(x, y; t_0 + \epsilon)$
$y_1 = \varphi(x, y; t_0 + \epsilon)$ (8.21)

is said to be an infinitesimal in the sense that $x_1$ and $y_1$ differ from x and y,
respectively, by small amounts when $\epsilon$ is sufficiently small. Subtracting
(8.20) from (8.21) yields

$\delta x = x_1 - x = f(x, y; t_0 + \epsilon) - f(x, y; t_0)$
$\delta y = y_1 - y = \varphi(x, y; t_0 + \epsilon) - \varphi(x, y; t_0)$

If f and $\varphi$ are differentiable in a neighborhood of $t = t_0$, one has

$\delta x = \epsilon \left( \dfrac{\partial f}{\partial t} \right)_{t=t_0} = \xi(x, y)\, \delta t$
$\delta y = \epsilon \left( \dfrac{\partial \varphi}{\partial t} \right)_{t=t_0} = \eta(x, y)\, \delta t$ (8.22)

except for infinitesimals of higher order, $\epsilon = \delta t$.

Example 8.12. For the rotation group of Example 8.11 one has $\left( \dfrac{\partial f}{\partial \theta} \right)_{\theta=0} = -y$,
$\left( \dfrac{\partial \varphi}{\partial \theta} \right)_{\theta=0} = x$, so that (8.22) becomes

$\delta x = -y\, \delta t$, $\delta y = x\, \delta t$


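Let us consider the change in the value of a function F(x, y) when the
point (x, y) undergoes an infinitesimal transformation. One has

$\delta F = F(x_1, y_1) - F(x, y) = \dfrac{\partial F}{\partial x} \delta x + \dfrac{\partial F}{\partial y} \delta y$

except for infinitesimals of higher order. From (8.22) one obtains

$\delta F = \left( \xi \dfrac{\partial F}{\partial x} + \eta \dfrac{\partial F}{\partial y} \right) \delta t$ (8.23)

The operator U defined by

$U = \xi \dfrac{\partial}{\partial x} + \eta \dfrac{\partial}{\partial y}$ (8.24)

is very useful. Equation (8.23) can be written as $\delta F = (UF)\, \delta t$. If
t = 0 corresponds to the identity transformation, we can replace $\delta t$ by t,
remembering that t is small.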

It is interesting to note that if one is given the system of differential
equations

$\dfrac{dx_1}{dt} = \xi(x_1, y_1)$, $x_1(0) = x$
$\dfrac{dy_1}{dt} = \eta(x_1, y_1)$, $y_1(0) = y$ (8.25)

then the solution of (8.25), written

$x_1 = f(x, y; t)$
$y_1 = \varphi(x, y; t)$ (8.26)

is a one-parameter group additive in t. To show this, we first note that
(8.26) yields a curve starting at the fixed point (x, y). As t varies, $x_1$ and
$y_1$ vary (see Fig. 8.1).

Now let us consider the system

$\dfrac{dx_2}{dt_1} = \xi(x_2, y_2)$, $x_2 = x_1$ at $t_1 = t$
$\dfrac{dy_2}{dt_1} = \eta(x_2, y_2)$, $y_2 = y_1$ at $t_1 = t$ (8.27)

If we let $t_2 = t_1 - t$, t fixed, (8.27) becomes

$\dfrac{dx_2}{dt_2} = \xi(x_2, y_2)$, $x_2 = x_1$ at $t_2 = 0$
$\dfrac{dy_2}{dt_2} = \eta(x_2, y_2)$, $y_2 = y_1$ at $t_2 = 0$ (8.28)

Since (8.28) has exactly the same form and initial conditions as (8.25),
of necessity,

$x_2 = f(x_1, y_1; t_2) = f(x_1, y_1; t_1 - t)$
$y_2 = \varphi(x_1, y_1; t_2) = \varphi(x_1, y_1; t_1 - t)$ (8.29)

At $t_1 = t$ we have $x_2 = x_1$, $y_2 = y_1$. It is obvious that (8.29) yields the
same curve as given in Fig. 8.1, and for $t_1 > t$ we obtain an extension of
the curve to the point $(x_2, y_2)$. Thus the product of two transformations
belongs to the group, and the parameter t is additive. The establishment
of the inverse transformation is left to the reader.

[Fig. 8.1: the solution curve from (x, y), passing through the point P $(x_1, y_1)$ at t and reaching $(x_2, y_2)$ at $t_1 > t$.]

Let us now begin with a group transformation given by (8.19) and
attempt to establish a system of differential equations involving $x_1$,
$y_1$, and t. From the group property we have

$x_2 = f(x, y; \beta(t, t_1)) = f(x_1, y_1; t_1)$
$y_2 = \varphi(x, y; \beta(t, t_1)) = \varphi(x_1, y_1; t_1)$ (8.30)

when $x_1$ and $y_1$ are replaced by the values as given by (8.19). We choose
t = 0 as that value of the parameter which yields the identity element.




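Of necessity, $\beta(t, 0) = t$, since $f(x_1, y_1; 0) = x_1 = f(x, y; t)$. We differentiate
(8.30) with respect to $t_1$, keeping x, y, and t fixed. Thus

$\dfrac{\partial f(x, y; \beta)}{\partial \beta} \dfrac{\partial \beta}{\partial t_1} = \dfrac{\partial f(x_1, y_1; t_1)}{\partial t_1}$

Evaluating at $t_1 = 0$ yields

$\dfrac{\partial f(x, y; t)}{\partial t} \left( \dfrac{\partial \beta}{\partial t_1} \right)_{t_1=0} = \left. \dfrac{\partial f(x_1, y_1; t_1)}{\partial t_1} \right|_{t_1=0} = \xi(x_1, y_1)$
$\dfrac{\partial \varphi(x, y; t)}{\partial t} \left( \dfrac{\partial \beta}{\partial t_1} \right)_{t_1=0} = \left. \dfrac{\partial \varphi(x_1, y_1; t_1)}{\partial t_1} \right|_{t_1=0} = \eta(x_1, y_1)$

From (8.19), $\dfrac{dx_1}{dt} = \dfrac{\partial f(x, y; t)}{\partial t}$, $\dfrac{dy_1}{dt} = \dfrac{\partial \varphi(x, y; t)}{\partial t}$, so that

$\dfrac{dx_1}{dt} = \lambda(t)\, \xi(x_1, y_1)$
$\dfrac{dy_1}{dt} = \lambda(t)\, \eta(x_1, y_1)$ (8.31)

where $\lambda(t) = \left( \dfrac{\partial \beta}{\partial t_1} \right)^{-1}_{t_1=0}$. If t is additive, we have $\beta(t, t_1) = t + t_1$,
$\lambda(t) = 1$ and (8.31) becomes (8.25). In any case

$\dfrac{dx_1}{\lambda(t)\, dt} = \xi(x_1, y_1)$
$\dfrac{dy_1}{\lambda(t)\, dt} = \eta(x_1, y_1)$ (8.32)

so that the change of variable, $u = \int \lambda(t)\, dt$, reduces (8.32) to (8.25).
Thus one can always find a new parameter such that the group transformation
is additive relative to the parameter.

Example 8.13. We consider $\dfrac{dx_1}{dt} = x_1^2$, $\dfrac{dy_1}{dt} = 1$, with $x_1 = x$, $y_1 = y$ at t = 0. The
solution is $x_1 = x/(1 - xt)$, $y_1 = t + y$. Let the reader show that an additive group
has been obtained.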
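
The additive property claimed in Example 8.13 can be checked with exact rational arithmetic. The sketch below takes the solution as $x_1 = x/(1 - xt)$, $y_1 = y + t$ and tests one sample point; `flow` is an illustrative name, and the check is not a substitute for the general argument.

```python
from fractions import Fraction

def flow(point, t):
    """Solution of dx1/dt = x1**2, dy1/dt = 1 starting from point = (x, y) at t = 0."""
    x, y = point
    return (x / (1 - x * t), y + t)

p = (Fraction(1, 3), Fraction(2))
t, s = Fraction(1, 2), Fraction(1, 4)

assert flow(p, Fraction(0)) == p                  # t = 0 is the identity
assert flow(flow(p, t), s) == flow(p, t + s)      # parameters add
assert flow(flow(p, t), -t) == p                  # inverse is obtained from -t
```

The formula is valid only while $1 - xt \ne 0$, so the sample values are chosen to stay away from that singularity.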


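Let us now consider an arbitrary function of x and y, say, F(x, y).
The value of $F(x_1, y_1)$ for t small can be obtained as follows: Since
$F(x_1, y_1) = F(f(x, y; t), \varphi(x, y; t))$, we expand F in a Taylor series about
t = 0, assuming that this is possible. Thus

$F(x_1, y_1) = F(x_1, y_1) \Big|_{t=0} + \left. \dfrac{dF(x_1, y_1)}{dt} \right|_{t=0} t + \left. \dfrac{d^2 F(x_1, y_1)}{dt^2} \right|_{t=0} \dfrac{t^2}{2!} + \cdots$ (8.33)

Now $F(x_1, y_1) \big|_{t=0} = F(x, y)$. Also

$\left. \dfrac{dF(x_1, y_1)}{dt} \right|_{t=0} = \left( \dfrac{\partial F}{\partial x_1} \dfrac{dx_1}{dt} + \dfrac{\partial F}{\partial y_1} \dfrac{dy_1}{dt} \right)_{t=0}$
$= \left( \dfrac{\partial F}{\partial x_1} \xi(x_1, y_1) + \dfrac{\partial F}{\partial y_1} \eta(x_1, y_1) \right)_{t=0}$
$= \dfrac{\partial F(x, y)}{\partial x} \xi(x, y) + \dfrac{\partial F(x, y)}{\partial y} \eta(x, y) = UF$ [see (8.24)]

provided the group transformation is additive. Moreover

$\left. \dfrac{d^2 F(x_1, y_1)}{dt^2} \right|_{t=0} = \left[ \dfrac{\partial^2 F}{\partial x_1^2} \xi^2 + 2 \dfrac{\partial^2 F}{\partial x_1 \partial y_1} \xi \eta + \dfrac{\partial^2 F}{\partial y_1^2} \eta^2 + \dfrac{\partial F}{\partial x_1} \left( \xi \dfrac{\partial \xi}{\partial x_1} + \eta \dfrac{\partial \xi}{\partial y_1} \right) + \dfrac{\partial F}{\partial y_1} \left( \xi \dfrac{\partial \eta}{\partial x_1} + \eta \dfrac{\partial \eta}{\partial y_1} \right) \right]_{t=0}$

with $\xi = \xi(x_1, y_1)$, $\eta = \eta(x_1, y_1)$. From

$U^2 F(x, y) = \left( \xi \dfrac{\partial}{\partial x} + \eta \dfrac{\partial}{\partial y} \right) \left( \xi \dfrac{\partial F}{\partial x} + \eta \dfrac{\partial F}{\partial y} \right)$

one easily shows that $\left. \dfrac{d^2 F(x_1, y_1)}{dt^2} \right|_{t=0} = U^2 F(x, y)$. By mathematical
induction it can be shown that

$\left. \dfrac{d^n F(x_1, y_1)}{dt^n} \right|_{t=0} = U^n F(x, y)$

Equation (8.33) can be written in the form

$F(x_1, y_1) = F(x, y) + (UF)\, t + (U^2 F) \dfrac{t^2}{2!} + \cdots$ (8.34)

For the special case $F(x_1, y_1) = x_1$ we have

$x_1(x, y, t) = x + (Ux)\, t + (U^2 x) \dfrac{t^2}{2!} + \cdots$

Example 8.14. For $\xi(x, y) = -y$, $\eta(x, y) = x$ we have

$U = -y \dfrac{\partial}{\partial x} + x \dfrac{\partial}{\partial y}$, $Ux = -y$, $U^2 x = -x$, $U^3 x = y$, $U^4 x = x$

so that

$x_1 = x - yt - x \dfrac{t^2}{2!} + y \dfrac{t^3}{3!} + x \dfrac{t^4}{4!} - \cdots$
$= x \left( 1 - \dfrac{t^2}{2!} + \dfrac{t^4}{4!} - \cdots \right) - y \left( t - \dfrac{t^3}{3!} + \cdots \right)$
$= x \cos t - y \sin t$

Similarly $y_1 = x \sin t + y \cos t$, and the rotation group is obtained.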
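
The series (8.34) for the rotation generator can be summed numerically. In this sketch a linear function $ax + by$ is stored as the pair (a, b), so that $U(ax + by) = bx - ay$ becomes the map (a, b) → (b, −a); this encoding and the names `apply_U` and `lie_series` are assumptions made for the illustration.

```python
import math

def apply_U(coeffs):
    """U = -y d/dx + x d/dy acting on a*x + b*y, stored as (a, b)."""
    a, b = coeffs
    return (b, -a)

def lie_series(coeffs, t, terms=30):
    """Partial sum of F + (UF) t + (U^2 F) t^2/2! + ... for linear F."""
    total = (0.0, 0.0)
    current = coeffs
    factor = 1.0                       # holds t**k / k!
    for k in range(terms):
        total = (total[0] + factor * current[0], total[1] + factor * current[1])
        current = apply_U(current)
        factor *= t / (k + 1)
    return total

t = 0.7
x1 = lie_series((1.0, 0.0), t)         # coefficients of x and y in x1
y1 = lie_series((0.0, 1.0), t)         # coefficients of x and y in y1

assert abs(x1[0] - math.cos(t)) < 1e-12 and abs(x1[1] + math.sin(t)) < 1e-12
assert abs(y1[0] - math.sin(t)) < 1e-12 and abs(y1[1] - math.cos(t)) < 1e-12
```

The two assertions restate $x_1 = x \cos t - y \sin t$ and $y_1 = x \sin t + y \cos t$ at the sample value of t.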




Problems 


1. Show that the Einstein-Lorentz transformations form a one-parameter group

$\bar{x} = \dfrac{x - Vt}{\sqrt{1 - V^2/c^2}}$
$\bar{t} = \dfrac{t - (V/c^2)\, x}{\sqrt{1 - V^2/c^2}}$

Note that the parameter V is not additive.
2. Find the transformation group obtained by integrating 


dzi _ dyy _ t= 


Le 
ae ae gee SE 


3. If Q(x, y) is an invariant under a group transformation, $Q(x, y) = Q(x_1, y_1)$, show
that UQ(x, y) = 0.

4. Solve UQ(x, y) = 0 for the rotation group, and show that $x^2 + y^2$ is an invariant
for this group.

5. Given $\xi(x, y) = -y$, $\eta(x, y) = -x$, find the transformation group.

6. The differential equation $\dfrac{dy}{dx} = f\left( \dfrac{y}{x} \right)$ remains invariant under the group
transformation $x = t x_1$, $y = t y_1$, for all t, since $\dfrac{y}{x} = \dfrac{y_1}{x_1}$ and $\dfrac{dy}{dx} = \dfrac{t\, dy_1}{t\, dx_1} = \dfrac{dy_1}{dx_1}$. Let
$u = \dfrac{y}{x}$, $v = \ln y = \ln y_1 + \ln t = v_1 + a$, $\ln t = a$. The equation $\dfrac{dy}{dx} = f\left( \dfrac{y}{x} \right)$ becomes
$F\left( u, v, \dfrac{du}{dv} \right) = 0$, and $F\left( u, v, \dfrac{du}{dv} \right)$ is invariant under the translation

$u = u_1$, $v = v_1 + a$

Thus F must be independent of v, so that $F\left( u, \dfrac{du}{dv} \right) = 0$ can be integrated by a


2 2 
separation of variables. Integrate dy ae ee. 
dx z?—y7? 





8.10. Symmetric Functions. A function $f(x_1, x_2, \ldots, x_n)$ is a symmetric
function in $x_1, x_2, \ldots, x_n$ if any permutation on the subscripts
1, 2, ..., n leaves f invariant. The function

$f(x_1, x_2, x_3) = x_1 x_2 x_3 + x_1^2 + x_2^2 + x_3^2$

is invariant under the symmetric group of order 6. If we expand the
product $(x - x_1)(x - x_2) \cdots (x - x_n)$, we obtain

$(x - x_1)(x - x_2) \cdots (x - x_n) = x^n - \sigma_1 x^{n-1} + \sigma_2 x^{n-2} - \sigma_3 x^{n-3} + \cdots + (-1)^n \sigma_n$

with

$\sigma_1 = x_1 + x_2 + x_3 + \cdots + x_n$
$\sigma_2 = x_1 x_2 + x_1 x_3 + \cdots + x_1 x_n + x_2 x_3 + \cdots + x_{n-1} x_n$
$\sigma_3 = \sum x_i x_j x_k$, $i < j < k$ (8.35)
$\cdots$
$\sigma_n = x_1 x_2 \cdots x_n$

The $\sigma_i$, $i = 1, 2, \ldots, n$, of (8.35) are called the fundamental symmetric
functions. It is apparent that any function of the $\sigma$'s is a symmetric
function. A few examples lead us to suspect that the converse is also
true. We note that $f(x_1, x_2) = x_1^3 + x_2^3$ can be written as

$(x_1 + x_2)^3 - 3 x_1 x_2 (x_1 + x_2) = \sigma_1^3 - 3 \sigma_1 \sigma_2$

with $\sigma_1 = x_1 + x_2$, $\sigma_2 = x_1 x_2$. The symmetric function

$f(x_1, x_2, x_3) = x_1^2 x_2 x_3 + x_2^2 x_1 x_3 + x_3^2 x_1 x_2$

can be written as $f(x_1, x_2, x_3) = (x_1 x_2 x_3)(x_1 + x_2 + x_3) = \sigma_3 \sigma_1$.

THEOREM 8.12. The Fundamental Theorem of Symmetric Functions.
If $f(x_1, x_2, \ldots, x_n)$ is a symmetric function (multinomial) in $x_1, x_2,
\ldots, x_n$, then $f(x_1, x_2, \ldots, x_n) = g(\sigma_1, \sigma_2, \ldots, \sigma_n)$, that is, f can be
written as a multinomial in the fundamental symmetric functions $\sigma_1,
\sigma_2, \ldots, \sigma_n$.

The proof of the theorem is by mathematical induction. The theorem
is certainly true for a function of one variable since $f(x_1) = f(\sigma_1)$. Let
us assume the theorem true for a function of n − 1 variables. Thus if
$f(x_1, x_2, \ldots, x_{n-1})$ is symmetric, then $f = g(\sigma_1, \sigma_2, \ldots, \sigma_{n-1})$. Now
let $f(x_1, x_2, \ldots, x_n)$ be any symmetric multinomial. Then $f(x_1, x_2,
\ldots, x_{n-1}, 0)$ is a symmetric function in $x_1, x_2, \ldots, x_{n-1}$; so by assumption
we can write $f(x_1, x_2, \ldots, x_{n-1}, 0) = g((\sigma_1)_0, (\sigma_2)_0, \ldots, (\sigma_{n-1})_0)$,
where $\sigma_1 = x_1 + x_2 + \cdots + x_n$, $(\sigma_1)_0 = x_1 + x_2 + \cdots + x_{n-1} + 0$,
etc., that is, $(\sigma_j)_0$ is the value of $\sigma_j$ evaluated at $x_n = 0$, for j = 1, 2,
..., n. Now if $f(x_1, x_2, \ldots, x_n)$ is a multinomial of degree 1, that is,
$f = a(x_1 + x_2 + \cdots + x_n)$, then the theorem is certainly true. We
assume the theorem true for any symmetric multinomial of degree < k.
This is a double mathematical induction type of proof. Now we consider
$h(x_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n) - g(\sigma_1, \sigma_2, \ldots, \sigma_{n-1})$.
It is obvious that h is also symmetric. Now

$h(x_1, x_2, \ldots, x_{n-1}, 0) = f(x_1, x_2, \ldots, x_{n-1}, 0) - g((\sigma_1)_0, (\sigma_2)_0, \ldots, (\sigma_{n-1})_0) = 0$

so that $x_n = 0$ is a zero of $h(x_1, x_2, \ldots, x_n)$, which implies that $x_n$ is a
factor of h. Since h is symmetric, of necessity $x_1, x_2, \ldots, x_{n-1}$ are also
factors of h. Thus

$h(x_1, x_2, \ldots, x_n) = x_1 x_2 \cdots x_n\, s(x_1, x_2, \ldots, x_n) = \sigma_n s$

where s is a symmetric function, and

$f(x_1, x_2, \ldots, x_n) = g(\sigma_1, \sigma_2, \ldots, \sigma_{n-1}) + \sigma_n s(x_1, x_2, \ldots, x_n)$

The degree of s is k − n < k, so that, by the induction hypothesis,
$s(x_1, x_2, \ldots, x_n)$ can be expressed in terms of the fundamental symmetric
functions, yielding

$f(x_1, x_2, \ldots, x_n) = g(\sigma_1, \sigma_2, \ldots, \sigma_{n-1}) + \sigma_n t(\sigma_1, \sigma_2, \ldots, \sigma_n)$

This concludes the proof of the fundamental theorem of symmetric
functions.


Example 8.15. We consider

$f(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 + 2 x_1 x_2 + 2 x_1 x_3 + 2 x_2 x_3 - 3 x_1^2 x_2^2 x_3 - 3 x_1^2 x_2 x_3^2 - 3 x_1 x_2^2 x_3^2$

With patience one can show that f is symmetric. Let us consider

$f(x_1, x_2, 0) = x_1^2 + x_2^2 + 2 x_1 x_2 = (x_1 + x_2)^2 = [(\sigma_1)_0]^2$

The function $f(x_1, x_2, x_3) - \sigma_1^2$ should have the factor $x_1 x_2 x_3$. We note that

$f - \sigma_1^2 = -3 x_1 x_2 x_3 (x_1 x_2 + x_1 x_3 + x_2 x_3) = -3 \sigma_2 \sigma_3$

so that $f = \sigma_1^2 - 3 \sigma_2 \sigma_3$.
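
The result of Example 8.15 can be spot-checked with integer arithmetic. The sketch below assumes the three degree-five terms of f are $-3x_1^2x_2^2x_3 - 3x_1^2x_2x_3^2 - 3x_1x_2^2x_3^2$, which is the reading consistent with the stated answer $\sigma_1^2 - 3\sigma_2\sigma_3$; the sample points are arbitrary.

```python
from itertools import permutations

def f(x1, x2, x3):
    """The multinomial of Example 8.15 (degree-five terms as assumed above)."""
    return (x1**2 + x2**2 + x3**2 + 2*x1*x2 + 2*x1*x3 + 2*x2*x3
            - 3*x1**2*x2**2*x3 - 3*x1**2*x2*x3**2 - 3*x1*x2**2*x3**2)

def sigmas(x1, x2, x3):
    """Fundamental symmetric functions sigma_1, sigma_2, sigma_3 of (8.35) for n = 3."""
    return (x1 + x2 + x3, x1*x2 + x1*x3 + x2*x3, x1*x2*x3)

samples = [(2, -3, 5), (1, 1, 4), (-7, 0, 6), (3, 8, -2)]
for xs in samples:
    s1, s2, s3 = sigmas(*xs)
    # f takes the same value under every permutation of its arguments
    assert all(f(*p) == f(*xs) for p in permutations(xs))
    # and agrees with the expression in the fundamental symmetric functions
    assert f(*xs) == s1**2 - 3*s2*s3
```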


Problems 
1. Show that

$(1 + x_1^2)(1 + x_2^2)(1 + x_3^2) = 1 + \sigma_1^2 - 2\sigma_2 + \sigma_2^2 - 2\sigma_1 \sigma_3 + \sigma_3^2$

2. Let $\alpha$, $\beta$, $\gamma$ be the zeros of $f(x) = x^3 + px^2 + qx + r$. Show that


BONE a ee a at 
a + B a+ y Bry pq-?r 





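3. Referring to Prob. 2, show that

$\alpha^2 \beta^2 + \alpha^2 \gamma^2 + \beta^2 \gamma^2 = q^2 - 2pr$

4. Consider

$s_1 = x_1 + x_2 + \cdots + x_n$
$s_2 = x_1^2 + x_2^2 + \cdots + x_n^2$
$s_3 = x_1^3 + x_2^3 + \cdots + x_n^3$
$s_k = x_1^k + x_2^k + \cdots + x_n^k$

Express $\sigma_3$ in terms of $s_1$, $s_2$, $s_3$.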


8.11. Polynomials. We shall be interested in polynomials of the type

$p(x) = \sum_{k=0}^{n} c_k x^k = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n$ (8.36)

The coefficients $c_i$, $i = 0, 1, 2, \ldots, n$, are assumed to belong to a field F.
The reader is referred to Sec. 4.1 for the properties of a field. If a set
of elements belong to a field, we can perform the simple operations of
addition, multiplication, and their inverses on these elements. The
zero element and unit element belong to a field and the distributive laws,
a(b + c) = ab + ac, (b + c)a = ba + ca, hold. The complex numbers
form a field. A subfield of the field of complex numbers is the field of
real numbers. The rational numbers form a subfield of the field of
real numbers. Let us consider the set of real numbers of the form
$a + b\sqrt{2}$, where a and b are rational numbers. The sum and product
of two such numbers is evidently a number of this set. The unit element
is $1 = 1 + 0 \cdot \sqrt{2}$, and the zero element is $0 = 0 + 0 \cdot \sqrt{2}$. We must
show that every nonzero number has a unique inverse. From the fact
that $\sqrt{2}$ is irrational we have $a + b\sqrt{2} = 0$ if and only if a = b = 0.
Assume $a + b\sqrt{2} \ne 0$. Then it is easy to show that $[a/(a^2 - 2b^2)] +
[(-b)/(a^2 - 2b^2)]\sqrt{2}$ is the unique inverse of $a + b\sqrt{2}$. Let the reader
solve

$(a + b\sqrt{2})(x + y\sqrt{2}) = 1 + 0 \cdot \sqrt{2}$

for x and y, $a^2 - 2b^2 \ne 0$. We call this field an algebraic extension of
the field of rationals, written $R(\sqrt{2})$. We shall discuss algebraic extensions
of a field in the next section.
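
The field operations on numbers $a + b\sqrt{2}$ can be modeled exactly with rational coefficients. The class below is a small sketch for illustration (the name `QSqrt2` and its methods are not from the text); the inverse uses the formula quoted above.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass
class QSqrt2:
    """The number a + b*sqrt(2) with rational a and b."""
    a: Fraction
    b: Fraction = Fraction(0)

    def __post_init__(self):
        # Accept ints as well, but always store exact rationals.
        self.a, self.b = Fraction(self.a), Fraction(self.b)

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b r)(c + d r) = (ac + 2bd) + (ad + bc) r, where r*r = 2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def inverse(self):
        norm = self.a**2 - 2 * self.b**2   # nonzero unless a = b = 0
        if norm == 0:
            raise ZeroDivisionError("zero has no inverse")
        return QSqrt2(self.a / norm, -self.b / norm)

z = QSqrt2(3, -4)
assert z * z.inverse() == QSqrt2(1, 0)
```

Since $\sqrt{2}$ is irrational, $a^2 - 2b^2$ vanishes for rational a, b only when a = b = 0, which is why the single guard in `inverse` suffices.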

If f(x) and g(x) are polynomials with coefficients in a field F, we say
that g(x) divides f(x) if a polynomial h(x) with coefficients in F exists
such that f(x) = g(x)h(x). An important theorem concerning polynomials
is as follows: Given two polynomials with coefficients in F, say,
f(x) of degree m, g(x) of degree n, there exist two polynomials with coefficients
in F, r(x) of degree < n, s(x) of degree < m, such that

$r(x)f(x) + s(x)g(x) = d(x)$ (8.37)

where d(x) is the greatest common divisor of f(x) and g(x). All other
divisors of both f(x) and g(x) are divisors of d(x). The coefficients of
d(x) are in F.

Proof. We consider the set of all polynomials of the form a(x)f(x) +
b(x)g(x), with a(x) and b(x) arbitrary. These polynomials have degrees
(exponent of highest power of x) greater than or equal to zero. A constant
is a polynomial of degree zero. Let a(x) = r(x), b(x) = s(x) be
those polynomials for which the degree of a(x)f(x) + b(x)g(x) is a minimum,
but not identically zero, and let

$d(x) = r(x)f(x) + s(x)g(x)$

Any divisor of both f(x) and g(x) is obviously a divisor of d(x). We show
now that d(x) divides both f(x) and g(x). If d(x) does not divide f(x),
then by division (always possible since the coefficients are in a field)

$f(x) = q(x)d(x) + t(x)$, 0 ≤ degree of t(x) < degree of d(x)

$d(x) \ne 0$. Hence $qd = qrf + qsg = f - t$, and

$(1 - qr)f + (-qs)g = t(x)$ (8.38)

Since t(x) has the form a(x)f(x) + b(x)g(x) and a lower degree than d(x),
the definition of d(x) shows that (8.38) is impossible unless $t(x) \equiv 0$, so that
d(x) divides f(x), written d(x)/f(x). Similarly d(x)/g(x). Let the reader
show that degree of r(x) < degree of g(x), degree of s(x) < degree of f(x).

Example 8.16. Let $f(x) = x^2 + 1$, $g(x) = x$. Then $1 \cdot (x^2 + 1) + (-x)x = 1$, so
that 1 is the greatest common divisor of f(x) and g(x). We say that f(x) and g(x) are
relatively prime.
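
The polynomials r(x) and s(x) of (8.37) can be computed constructively. The proof above is an existence argument by minimal degree; the sketch below instead uses the extended Euclidean algorithm, a different but standard route to the same identity. Polynomials are lists of rational coefficients, lowest degree first; all function names are illustrative.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(p, c):
    return trim([c * a for a in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(f, g):
    """Quotient and remainder of f by a nonzero g (coefficients in a field)."""
    f, g = trim(f), trim(g)
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f[:]
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = r[-1] / g[-1]
        q[shift] = c
        r = add(r, scale([Fraction(0)] * shift + g, -c))   # cancel the leading term
    return trim(q), r

def ext_gcd(f, g):
    """Return (d, r, s) with r*f + s*g = d, d the monic gcd (f, g not both zero)."""
    r0, r1 = trim(f), trim(g)
    a0, a1, b0, b1 = [Fraction(1)], [], [], [Fraction(1)]
    while r1:                                   # invariant: r0 = a0*f + b0*g
        q, rem = divmod_poly(r0, r1)
        r0, r1 = r1, rem
        a0, a1 = a1, add(a0, scale(mul(q, a1), -1))
        b0, b1 = b1, add(b0, scale(mul(q, b1), -1))
    lead = r0[-1]
    return scale(r0, 1 / lead), scale(a0, 1 / lead), scale(b0, 1 / lead)

# Example 8.16: f = x^2 + 1, g = x
d, r, s = ext_gcd([1, 0, 1], [0, 1])
assert d == [1] and add(mul(r, [1, 0, 1]), mul(s, [0, 1])) == d
```

The pair (r, s) returned here is one valid choice; adding a multiple of g/d to r and subtracting the matching multiple of f/d from s gives others.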

Problems 

1. Show that the integers do not form a field.

2. Show that the set of numbers $x + y\sqrt{3}$, x and y ranging over all rational
numbers, forms a field.

3. Show that x + 1 is the greatest common divisor of $f(x) = x^3 + 1$, $g(x) = (x + 1)^2$.
Find r(x) and s(x) such that r(x)f(x) + s(x)g(x) = x + 1. Are r(x) and s(x) unique?

4. Let f(x) and g(x) be polynomials with coefficients in a field $F_1$, with $F_1$ a subfield
of the field F. If f(x) and g(x) are relatively prime with respect to the field $F_1$, show
that f(x) and g(x) are relatively prime relative to the field F. How does this apply
to Prob. 3?

5. Do the set of numbers $a + b\sqrt{2} + c\sqrt{3}$, a, b, c rational, form a field? What of
$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$?

6. Let F be any field. If we label the zero and unit elements 0 and 1, respectively,
and define 1 + 1 = 2, etc., show that every field contains the rationals as a subfield.


8.12. The Algebraic Extension of a Field F. Let $p(x) = \sum_{k=0}^{n} c_k x^k$ be a
polynomial with coefficients in a field F. We say that p(x) is irreducible
in F if p(x) cannot be written as a product (nontrivial) of two polynomials
with coefficients in F. We say that $\rho$ is a zero of p(x) or a root of
p(x) = 0 if $p(\rho) = 0$. Kronecker has shown that one can always extend
the field F in such a fashion that a new number $\rho$ is introduced, yielding
$p(\rho) = 0$. The solution of $x^2 + 1 = 0$ involves the invention of the
number $i = \sqrt{-1}$.

THEOREM 8.13. Let p(x) be an irreducible polynomial in F such that
$p(\rho) = 0$. If q(x) has coefficients in F, then $q(\rho) = 0$ implies that
p(x)/q(x).

Proof. Since p(x) is irreducible, p(x)/q(x), or p(x) and q(x) are relatively
prime. If they were relatively prime, then from Sec. 8.11

$A(x)p(x) + B(x)q(x) = 1$

for some A(x), B(x). Thus $A(\rho)p(\rho) + B(\rho)q(\rho) = 0 = 1$, a contradiction.
Thus p(x) divides q(x).
Two corollaries follow immediately from Theorem 8.13.

COROLLARY 1. p(x) is that irreducible polynomial of smallest degree
such that $p(\rho) = 0$.
COROLLARY 2. If we make the leading coefficient of p(x) unity,
$p(x) = x^n + c_{n-1} x^{n-1} + \cdots + c_0$, then p(x) is unique.
DEFINITION 8.7. The n zeros of p(x), say, $\rho_1, \rho_2, \ldots, \rho_n$, are called
conjugates of each other. The zeros of $x^2 - 2$, namely, $\sqrt{2}$, $-\sqrt{2}$, are
conjugates of each other. It follows from Theorem 8.13 that any one of
the n conjugates of p(x) determines p(x) uniquely, assuming that the
leading coefficient of p(x) is unity.

THEOREM 8.14. Let $\rho$ be a zero of the irreducible polynomial p(x)
with coefficients in a field F. The numbers $1, \rho, \rho^2, \ldots, \rho^{n-1}$ are linearly
independent relative to F, since if constants $c_i$, $i = 0, 1, 2, \ldots,
n - 1$, not all zero, exist in F such that $\sum_{k=0}^{n-1} c_k \rho^k = 0$, then $t(x) = \sum_{k=0}^{n-1} c_k x^k$ is a polynomial
in F of degree < n with $t(\rho) = 0$, a contradiction to Corollary 1.

An important result is the following: Given the irreducible polynomial
p(x) with coefficients in F such that $p(\rho) = 0$, one can obtain a field $F_1$
containing $\rho$ such that F is a subfield of $F_1$. We call $F_1$ an algebraic
extension of F and write $F_1 = F(\rho)$. The elements of $F(\rho)$ are of the
form

$\sigma = \dfrac{\sum_{i=0}^{m} a_i \rho^i}{\sum_{j=0}^{k} b_j \rho^j}$ (8.39)

with $\sum_{j=0}^{k} b_j \rho^j \ne 0$, $a_i$ in F, $b_j$ in F.
Let the reader show that the elements of the type given by (8.39) form
a field. By using the fact that $\rho^n + c_{n-1} \rho^{n-1} + \cdots + c_0 = 0$ one has

$\rho^n = -(c_0 + c_1 \rho + \cdots + c_{n-1} \rho^{n-1})$
$\rho^{n+1} = -(c_0 \rho + c_1 \rho^2 + \cdots + c_{n-1} \rho^n)$
$= -(c_0 \rho + c_1 \rho^2 + \cdots + c_{n-2} \rho^{n-1}) + c_{n-1}(c_0 + c_1 \rho + \cdots + c_{n-1} \rho^{n-1})$

so that powers of $\rho$ higher than n − 1 can always be reduced to lower
powers. For example, if $\rho^2 + \rho + 1 = 0$, then

$\rho^2 = -(\rho + 1)$, $\rho^3 = -\rho - \rho^2 = -\rho + \rho + 1 = 1$, etc.

We show next that the elements $\sigma$ of (8.39) can actually be written as
polynomials in $\rho$. The denominator of (8.39) is $\sum_{j=0}^{k} b_j \rho^j$. Let

$h(x) = \sum_{j=0}^{k} b_j x^j$

Since $h(\rho) \ne 0$, of necessity h(x) and p(x) are relatively prime. From
Sec. 8.11 we have

$A(x)h(x) + B(x)p(x) = 1$
$A(\rho)h(\rho) + B(\rho)p(\rho) = 1 = A(\rho)h(\rho)$

so that the inverse of $h(\rho)$ is $A(\rho)$. Thus

$\sigma(\rho) = A(\rho) \sum_{i=0}^{m} a_i \rho^i = \sum_{i=0}^{n-1} d_i \rho^i$

by using the fact that $p(\rho) = 0$ to eliminate higher powers of $\rho$.


Example 8.17. The field $R(\sqrt{2})$ is obtained by considering $x^2 - 2 = 0$, $\rho^2 - 2 = 0$,
$\rho = \sqrt{2}$. The same field is obtained if we consider $\rho = -\sqrt{2}$. This is not true for
all cases. The coefficients of $x^2 - 2$ belong to the field of rationals. The elements of
$R(\sqrt{2})$ are of the type $a + b\rho = a + b\sqrt{2}$, a and b rational.

Next we consider the polynomial equation $p(x) = x^2 - 3 = 0$. It is easy to show
that p(x) is irreducible in the field $R(\sqrt{2})$ since $\sqrt{3} = a + b\sqrt{2}$, a and b rational, is
impossible. Let the reader show this. Now the coefficients of $x^2 - 3$, namely, 1 and
−3, belong to $R(\sqrt{2})$. Hence we obtain an algebraic extension of $R(\sqrt{2})$ by considering
the set of numbers of the type $(a + b\sqrt{2}) + (c + d\sqrt{2})\sqrt{3}$, a, b, c, d
rational. This is the field $R(\sqrt{2}, \sqrt{3})$, and its elements are of the form $a + b\sqrt{2} +
c\sqrt{3} + d\sqrt{6}$, a, b, c, d rational. In a similar manner we could construct $R(\sqrt{3}, \sqrt{2})$.
It is evident that $R(\sqrt{2}, \sqrt{3}) = R(\sqrt{3}, \sqrt{2})$.


THEOREM 8.15. Let $\alpha$ be any number of the algebraic extension $F(\rho)$
obtained from $p(\rho) = 0$. Then $\alpha$ is the zero of a polynomial f(x) with
coefficients in F. We say that f(x) = 0 is the principal equation of $x = \alpha$.

Proof. Let the conjugates of p(x) = 0 be $\rho = \rho_1, \rho_2, \rho_3, \ldots, \rho_n$.
Since $\alpha$ is in $F(\rho)$, of necessity,

$\alpha = \alpha_1 = a_0 + a_1 \rho_1 + a_2 \rho_1^2 + \cdots + a_{n-1} \rho_1^{n-1}$

We form the conjugates of $\alpha$, defined by

$\alpha_2 = a_0 + a_1 \rho_2 + a_2 \rho_2^2 + \cdots + a_{n-1} \rho_2^{n-1}$
$\cdots$
$\alpha_n = a_0 + a_1 \rho_n + a_2 \rho_n^2 + \cdots + a_{n-1} \rho_n^{n-1}$ (8.40)

Certainly $f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$ has $\alpha_1$ as a zero. We
need only show that the coefficients of f(x) are in F. Now

$f(x) = x^n - \left( \sum_{j=1}^{n} \alpha_j \right) x^{n-1} + \cdots + (-1)^n \alpha_1 \alpha_2 \cdots \alpha_n$

Moreover $\sum \alpha_i$, $\sum \alpha_i \alpha_j$, etc., are certainly symmetric in the $\rho_i$, $i = 1, 2,
\ldots, n$, and so can be written as polynomials involving the fundamental
symmetric functions,

$\rho_1 + \rho_2 + \cdots + \rho_n$
$\cdots$
$\rho_1 \rho_2 \cdots \rho_n$ (8.41)

The elements of (8.41) are simply the coefficients of p(x), except for some
negative factors, as seen by expanding $(x - \rho_1)(x - \rho_2) \cdots (x - \rho_n)$.
Thus the coefficients of f(x) are in F.


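Example 8.18. Referring to Example 8.17, we consider $\alpha = \alpha_1 = 3 - 4\sqrt{2}$.
Then $\alpha_2 = 3 + 4\sqrt{2}$, and

$f(x) = (x - \alpha_1)(x - \alpha_2) = (x - 3 + 4\sqrt{2})(x - 3 - 4\sqrt{2})$
$= (x - 3)^2 - 32$
$= x^2 - 6x - 23$

The coefficients of f(x) are in the field of rationals.

Example 8.19. Let us find the principal equation of $\alpha = 1 + \rho^2$, where $\rho$ is a zero
of $p(x) = x^3 + 3x + 2$. Let the zeros of p(x) be $\rho = \rho_1, \rho_2, \rho_3$. Then

$f(x) = [x - (1 + \rho_1^2)][x - (1 + \rho_2^2)][x - (1 + \rho_3^2)]$
$= x^3 - [3 + \rho_1^2 + \rho_2^2 + \rho_3^2] x^2 + [(1 + \rho_1^2)(1 + \rho_2^2) + (1 + \rho_2^2)(1 + \rho_3^2) + (1 + \rho_3^2)(1 + \rho_1^2)] x - (1 + \rho_1^2)(1 + \rho_2^2)(1 + \rho_3^2)$

From

$\sigma_1 = \rho_1 + \rho_2 + \rho_3 = 0$
$\sigma_2 = \rho_1 \rho_2 + \rho_2 \rho_3 + \rho_3 \rho_1 = 3$
$\sigma_3 = \rho_1 \rho_2 \rho_3 = -2$

we have

$\rho_1^2 + \rho_2^2 + \rho_3^2 = (\rho_1 + \rho_2 + \rho_3)^2 - 2(\rho_1 \rho_2 + \rho_2 \rho_3 + \rho_3 \rho_1) = 0 - 2 \cdot 3 = -6$

Also $(1 + \rho_1^2)(1 + \rho_2^2)(1 + \rho_3^2) = 1 + \sigma_1^2 - 2\sigma_2 + \sigma_2^2 - 2\sigma_1 \sigma_3 + \sigma_3^2 = 8$ (see Prob. 1,
Sec. 8.10). Moreover

$(1 + \rho_1^2)(1 + \rho_2^2) + (1 + \rho_2^2)(1 + \rho_3^2) + (1 + \rho_3^2)(1 + \rho_1^2)$
$= 3 + 2(\rho_1^2 + \rho_2^2 + \rho_3^2) + (\rho_1 \rho_2 + \rho_2 \rho_3 + \rho_3 \rho_1)^2 - 2 \rho_1 \rho_2 \rho_3 (\rho_1 + \rho_2 + \rho_3)$
$= 3 + 2(-6) + 9 = 0$

Hence $f(x) = x^3 + 3x^2 - 8$. We check that $1 + \rho^2$ satisfies

$(1 + \rho^2)^3 + 3(1 + \rho^2)^2 - 8 = 0$

We have

$f(1 + \rho^2) = \rho^6 + 6\rho^4 + 9\rho^2 - 4$
$= (-2 - 3\rho)^2 + 6\rho(-2 - 3\rho) + 9\rho^2 - 4 = 0$

since $\rho^3 + 3\rho + 2 = 0$, or $\rho^3 = -2 - 3\rho$.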
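
The check at the end of Example 8.19 can be carried out mechanically by computing in $F(\rho)$ with polynomials in $\rho$ reduced by $p(\rho) = 0$. The sketch below assumes $p(x) = x^3 + 3x + 2$ over the rationals and stores an element $d_0 + d_1\rho + d_2\rho^2$ as the list [d0, d1, d2]; the helper names are illustrative.

```python
from fractions import Fraction

P = [2, 3, 0, 1]   # p(x) = x^3 + 3x + 2, lowest degree first (monic)

def reduce_mod(poly, modulus=P):
    """Use p(rho) = 0 to rewrite a polynomial in rho with degree below deg p."""
    poly = [Fraction(c) for c in poly]
    n = len(modulus) - 1
    while len(poly) > n:
        lead = poly.pop()                 # coefficient of the highest power
        shift = len(poly) - n             # that power is rho**(n + shift)
        for i in range(n):                # rho**n = -(c_0 + ... + c_{n-1} rho**(n-1))
            poly[shift + i] -= lead * modulus[i]
    return poly + [Fraction(0)] * (n - len(poly))

def mul_mod(u, v):
    """Product of two elements of F(rho)."""
    out = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += a * b
    return reduce_mod(out)

def evaluate(coeffs, element):
    """Horner evaluation of a rational polynomial at an element of F(rho)."""
    result = reduce_mod([0])
    for c in reversed(coeffs):
        result = mul_mod(result, element)
        result[0] += c
    return result

alpha = reduce_mod([1, 0, 1])          # alpha = 1 + rho^2
f = [-8, 0, 3, 1]                      # f(x) = x^3 + 3x^2 - 8

assert reduce_mod([0, 0, 0, 1]) == [-2, -3, 0]   # rho^3 = -2 - 3 rho
assert evaluate(f, alpha) == [0, 0, 0]           # f(1 + rho^2) = 0 in F(rho)
```

Because the arithmetic is exact, a result of [0, 0, 0] means the identity holds for every zero $\rho$ of p(x), not just approximately for one of them.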




Problems 


1. Show that $x^3 + x + 1 = 0$ has no rational roots. Hint: Assume x = m/n is a
zero of $p(x) = x^3 + x + 1$, m and n integers in lowest form. Show that of necessity
m = ±1, n = ±1. By testing x = ±1 show that p(x) has no rational zeros. Why
is p(x) irreducible in the field of rationals?

2. The polynomial $p(x) = x^4 + x^3 + 2x^2 + x + 1$ has no rational zeros. However,
p(x) is reducible in the field of rationals since

$x^4 + x^3 + 2x^2 + x + 1 = (x^2 + x + 1)(x^2 + 1)$

Is this a contradiction?

3. Let $\rho$ be a zero of $p(x) = x^3 + x + 1$ (see Prob. 1). Express $(1 + \rho + \rho^2)/(1 - \rho
+ \rho^2)$ as a second-degree polynomial in $\rho$.

4. Let $f(x) = a_0 + a_1 x + \cdots + a_n x^n$, $a_i$, $i = 0, 1, 2, \ldots, n$, integers. If a
prime number p exists such that p does not divide $a_n$, p divides $a_i$ for i < n, $p^2$ does
not divide $a_0$, then f(x) is irreducible in the field of rationals. This criterion is due to
Eisenstein. Show that $x^n - p$ is irreducible in the field of rationals, p a prime. Show
that $f(x + 1) = x^2 + 2x + 2 = (x + 1)^2 + 1$ is irreducible in the field of rationals.
Why can one immediately make the same statement for $f(x) = x^2 + 1$?

5. Consider p(x) = x4 + 3224+ 9. Let the zeros of p(r) be p, —p, 6, —c Show 
that the principal equation of a = 1 + 6p + p21s f(x) = [(z — 1)? — 8I]?. 

6. Consider $x^2 - 2 = 0$, $x^3 - 2 = 0$, and obtain the field $R(\sqrt{2}, \sqrt[3]{2})$.

7. A polynomial is said to be separable if it has no multiple zeros. Show that if
p(x) is irreducible it is separable.

8. Consider $f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$, the $\alpha_i$ defined by (8.40). If
f(x) is irreducible, then $f(x) = [f(x)]^1$. If f(x) is reducible in F, then

$f(x) = f_1(x) f_2(x) \cdots f_r(x)$

Assume $f_1(x)$ is a function for which $f_1(\alpha_1) = 0$, so that $f_1(\alpha_1(x))$ is zero for $x = \rho_1$.
Why does $p(x)/f_1(\alpha_1(x))$? Hence explain why $f_1(\alpha_1(x))$ is zero for $x = \rho_1, \rho_2, \ldots, \rho_n$,
so that $f_1(\alpha_1) = f_1(\alpha_2) = \cdots = f_1(\alpha_n) = 0$. If the $\alpha_i$ are not distinct, show that
$f(x) = [f_1(x)]^k$, otherwise $f(x) = f_1(x)$.

8.13. The Galois Resolvent. Let f(x) be a principal equation of
$x = \alpha$. We say that $\alpha$ is a primitive number of $F(\rho)$ if f(x) is irreducible
in F.

THEOREM 8.16. If $\alpha = \alpha_1$ is a primitive number of $F(\rho)$, then

$F(\alpha) = F(\rho)$

that is, every number of $F(\alpha)$ belongs to $F(\rho)$, and conversely.
Proof. Let $f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$ be the principal
equation of $\alpha = \alpha_1$. Define $\varphi(x)$ by

$\varphi(x) = f(x) \left( \dfrac{\rho_1}{x - \alpha_1} + \dfrac{\rho_2}{x - \alpha_2} + \cdots + \dfrac{\rho_n}{x - \alpha_n} \right)$

The $\rho_i$, $i = 1, 2, \ldots, n$, are the conjugate zeros of p(x). Then

$\varphi(\alpha_1) = \rho_1 (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3) \cdots (\alpha_1 - \alpha_n) = \rho_1 f'(\alpha_1)$

Since f(x) is assumed irreducible, $f'(\alpha_1) \ne 0$ (see Prob. 6, Sec. 8.12).
Thus $\rho_1 = \varphi(\alpha_1)/f'(\alpha_1)$, so that $\rho_1$ is a number in $F(\alpha_1)$. Any number of
$F(\rho_1)$ thus belongs to $F(\alpha_1)$. Why? Since $\alpha_1$ is a polynomial in $\rho_1$, it
follows that any number in $F(\alpha_1)$ is a number in $F(\rho_1)$. Q.E.D.
THEOREM 8.17. Let $\rho_1$ be a zero of $p_1(x)$, $\sigma_1$ a zero of $p_2(x)$, $p_1(x)$ and
$p_2(x)$ irreducible in F. We can form the fields $F(\rho_1)$, $F(\sigma_1)$. If $p_2(x)$
is reducible in $F(\rho_1)$, we consider the irreducible factor of $p_2(x)$ having
$\sigma_1$ as a zero. In this way we can form $F(\rho_1, \sigma_1)$ (see Example 8.17).
Let the reader show that $F(\rho_1, \sigma_1) = F(\sigma_1, \rho_1)$. We show that an irreducible
polynomial in F exists yielding a field $F(\tau)$ such that $F(\tau) = F(\rho_1, \sigma_1)$.
Proof. Let

$\tau_1 = \alpha \rho_1 + \beta \sigma_1$
$\tau_2 = \alpha \rho_1 + \beta \sigma_2$
$\cdots$
$\tau_n = \alpha \rho_1 + \beta \sigma_n$
$\tau_{n+1} = \alpha \rho_2 + \beta \sigma_1$
$\cdots$
$\tau_{mn} = \alpha \rho_m + \beta \sigma_n$

with $\alpha$ and $\beta$ in F. The $\rho_i$, $i = 1, 2, \ldots, m$, are the conjugates of
$p_1(x) = 0$, and the $\sigma_j$, $j = 1, 2, \ldots, n$, are the conjugates of $p_2(x) = 0$.
We choose $\alpha$ and $\beta$ so that the $\tau_i$, $i = 1, 2, \ldots, mn$, are all distinct.
This can be done since, if $\alpha \rho_i + \beta \sigma_j = \alpha \rho_k + \beta \sigma_l$, then

$\dfrac{\alpha}{\beta} = \dfrac{\sigma_l - \sigma_j}{\rho_i - \rho_k}$, $i \ne k$ (8.42)

There are only a finite number of ratios in (8.42), so that $\alpha$ and $\beta$ can be
chosen such that (8.42) fails to hold for all $i \ne k$, $j \ne l$. For i = k we
have $\beta \sigma_j \ne \beta \sigma_l$ if $\sigma_j \ne \sigma_l$. For j = l we have $\alpha \rho_i \ne \alpha \rho_k$ for $i \ne k$. Next
we form

$g(x) = (x - \tau_1)(x - \tau_2) \cdots (x - \tau_{mn})$
$= x^{mn} - \left( \sum \tau_i \right) x^{mn-1} + \cdots + (-1)^{mn} \tau_1 \tau_2 \cdots \tau_{mn}$ (8.43)

Any interchange of two $\rho$'s leaves the coefficients of g(x) invariant, as
does any interchange of two $\sigma$'s. Thus the coefficients of g(x) are in F.
That factor of g(x) irreducible in F having $\tau_1$ as a zero generates $F(\tau_1)$
with $\tau_1 = \alpha \rho_1 + \beta \sigma_1$. Since $\tau_1 = \alpha \rho_1 + \beta \sigma_1$, $F(\tau_1)$ is a subfield of $F(\rho_1, \sigma_1)$.
We now show that any number $\eta$ of $F(\rho_1, \sigma_1)$ belongs to $F(\tau_1)$ so that
$F(\tau_1) = F(\rho_1, \sigma_1)$. If $\eta$ is in $F(\rho_1, \sigma_1)$, $\eta$ is of the form

$\eta = \eta_1 = (a_0 + a_1 \rho_1 + \cdots + a_{m-1} \rho_1^{m-1})$
$+ (b_0 + b_1 \rho_1 + \cdots + b_{m-1} \rho_1^{m-1}) \sigma_1$
$+ \cdots + (c_0 + c_1 \rho_1 + \cdots + c_{m-1} \rho_1^{m-1}) \sigma_1^{n-1}$
$= \sum_{j=0}^{n-1} \sum_{i=0}^{m-1} d_{ij} \rho_1^i \sigma_1^j$

with the $d_{ij}$ in F. The mn − 1 other conjugates of $\eta_1$ can be formed by
replacing $\rho_1$ by $\rho_2, \rho_3, \ldots, \rho_m$ and $\sigma_1$ by $\sigma_2, \sigma_3, \ldots, \sigma_n$ in all possible
ways. We define $\varphi(x)$ by

$\varphi(x) = g(x) \left( \dfrac{\eta_1}{x - \tau_1} + \dfrac{\eta_2}{x - \tau_2} + \cdots + \dfrac{\eta_{mn}}{x - \tau_{mn}} \right)$ (8.44)

Let the reader show that $\eta_1 = \varphi(\tau_1)/g'(\tau_1)$, $g'(\tau_1) \ne 0$, since g(x) = 0 has
no multiple roots. Hence $\eta_1$ belongs to $F(\tau_1)$. Q.E.D.


Example 8.20. We consider $x^2 - 2 = 0$, $x^2 - 3 = 0$ with $\rho_1 = \sqrt{2}$, $\sigma_1 = \sqrt{3}$.
We form $\tau_1 = \sqrt{2} + \sqrt{3}$, $\tau_2 = \sqrt{2} - \sqrt{3}$, $\tau_3 = -\sqrt{2} + \sqrt{3}$, $\tau_4 = -\sqrt{2} - \sqrt{3}$.
No two of the $\tau$'s are equal. Then

$g(x) = (x - \tau_1)(x - \tau_2)(x - \tau_3)(x - \tau_4) = x^4 - 10x^2 + 1$

The coefficients of g(x) are in the field of rationals. It can be easily shown that g(x)
is irreducible in F. Thus g(x) = 0 generates $F(\sqrt{2} + \sqrt{3})$, which has elements of
the form

$a + b(\sqrt{2} + \sqrt{3}) + c(\sqrt{2} + \sqrt{3})^2 + d(\sqrt{2} + \sqrt{3})^3 = x + y\sqrt{2} + u\sqrt{3} + v\sqrt{6}$

x, y, u, v rational. Thus $F(\sqrt{2} + \sqrt{3}) = F(\sqrt{2}, \sqrt{3})$ (see Example 8.17).
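
A floating-point check of Example 8.20 is sketched below. It confirms that the four sums $\pm\sqrt{2} \pm \sqrt{3}$ are zeros of $x^4 - 10x^2 + 1$, and it also tests the identities $\sqrt{2} = (\tau^3 - 9\tau)/2$ and $\sqrt{3} = (11\tau - \tau^3)/2$ for $\tau = \sqrt{2} + \sqrt{3}$. These two identities are not stated in the text; they are added here as one explicit way to see that $\sqrt{2}$ and $\sqrt{3}$ lie in $F(\tau)$.

```python
import math
from itertools import product

def g(x):
    """g(x) = x^4 - 10 x^2 + 1 from Example 8.20."""
    return x**4 - 10 * x**2 + 1

# tau_1, ..., tau_4: all sign choices of sqrt(2) and sqrt(3)
taus = [s2 * math.sqrt(2) + s3 * math.sqrt(3)
        for s2, s3 in product((1, -1), repeat=2)]

assert all(abs(g(t)) < 1e-9 for t in taus)                        # each tau is a zero of g
assert min(abs(a - b) for a in taus for b in taus if a is not b) > 0.5   # all distinct

tau = math.sqrt(2) + math.sqrt(3)
assert abs((tau**3 - 9 * tau) / 2 - math.sqrt(2)) < 1e-9          # sqrt(2) as a polynomial in tau
assert abs((11 * tau - tau**3) / 2 - math.sqrt(3)) < 1e-9         # sqrt(3) as a polynomial in tau
```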


A special case of Theorem 8.17 occurs if $p_1(x) = p_2(x)$ and $\rho_1 \ne \sigma_1 = \rho_2$.
The field $F(\rho_1, \rho_2)$ can be generated. Continuing, we can adjoin to F all
the roots of p(x) = 0, obtaining $F(\rho_1, \rho_2, \ldots, \rho_n)$. A single number $\tau$
exists such that $F(\tau) = F(\rho_1, \rho_2, \ldots, \rho_n)$. The irreducible polynomial
having $\tau$ as a zero is called the Galois resolvent of p(x) = 0. It is a polynomial
of degree ≤ n!, since we need only find constants $\{a_k\}$ such that
$\tau = \sum_{k=1}^{n} a_k \rho_k$ and the other $\tau$'s obtained from the n! − 1 permutations of
the $\rho$'s are different from each other. The rest of the proof proceeds as
in Theorem 8.17.

DEFINITION 8.8. If $\rho_1, \rho_2, \ldots, \rho_n$ are the zeros of the irreducible
polynomial p(x) such that

$F(\rho_1) = F(\rho_2) = \cdots = F(\rho_n)$

then p(x) or p(x) = 0 is said to be normal and $F(\rho)$ is called a normal
extension of F. In this case $F(\rho_1) = F(\rho_1, \rho_2, \ldots, \rho_n)$ and p(x) is its
own Galois resolvent.


Example 8.21. Consider $p(x) = x^3 - 3x + 1 = 0$. Let $\rho$ be a zero of $p(x)$. By
division $x^3 - 3x + 1 = (x - \rho)(x^2 + \rho x + \rho^2 - 3)$. The other two roots of
$p(x) = 0$ are $\rho_{1,2} = (-\rho \pm \sqrt{12 - 3\rho^2})/2$. We show that $12 - 3\rho^2$ is a perfect
square in $F(\rho)$. Assume $12 - 3\rho^2 = (a + b\rho + c\rho^2)^2$, $a$, $b$, $c$ rational. Using the fact
that $\rho^3 = 3\rho - 1$, $\rho^4 = 3\rho^2 - \rho$, we have $12 = a^2 - 2bc$, $0 = -c^2 + 2ab + 6bc$,
$-3 = b^2 + 3c^2 + 2ac$. It is easy to check that $a = 4$, $b = -1$, $c = -2$ is a rational
solution so that $\rho_1 = 2 - \rho - \rho^2$, $\rho_2 = \rho^2 - 2$, which implies $F(\rho) = F(\rho_1) = F(\rho_2)$,
and $p(x)$ is normal.
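The computation in Example 8.21 can be reproduced exactly by working with polynomials in $\rho$ reduced by $\rho^3 = 3\rho - 1$. The Python sketch below is an illustration under that assumption, not the author's method: the tuple `(a, b, c)` stands for $a + b\rho + c\rho^2$, and the helper names `mul` and `p` are introduced here only for the check.

```python
def mul(u, v):
    """Multiply a1 + b1*rho + c1*rho^2 by a2 + b2*rho + c2*rho^2 modulo rho^3 - 3*rho + 1."""
    a1, b1, c1 = u
    a2, b2, c2 = v
    e0 = a1 * a2
    e1 = a1 * b2 + b1 * a2
    e2 = a1 * c2 + b1 * b2 + c1 * a2
    e3 = b1 * c2 + c1 * b2          # coefficient of rho^3
    e4 = c1 * c2                    # coefficient of rho^4
    # Reduce with rho^3 = 3*rho - 1 and rho^4 = 3*rho^2 - rho.
    return (e0 - e3, e1 + 3 * e3 - e4, e2 + 3 * e4)

def p(x):
    """Evaluate p(x) = x^3 - 3x + 1 at an element of F(rho)."""
    x3 = mul(mul(x, x), x)
    return (x3[0] - 3 * x[0] + 1, x3[1] - 3 * x[1], x3[2] - 3 * x[2])

root = (4, -1, -2)                  # a = 4, b = -1, c = -2
square = mul(root, root)            # should equal 12 - 3*rho^2

rho = (0, 1, 0)
rho1 = (2, -1, -1)                  # 2 - rho - rho^2
rho2 = (-2, 0, 1)                   # rho^2 - 2
root_sum = tuple(x + y + z for x, y, z in zip(rho, rho1, rho2))
```

Both $\rho_1$ and $\rho_2$ are thus expressed inside $F(\rho)$ and satisfy $p(x) = 0$, and the three roots sum to zero, in agreement with the vanishing $x^2$ coefficient.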




Problems 


1. Consider $p_1(x) = x^2 - 2 = 0$, $p_2(x) = x^4 - 2 = 0$, with $p_1(x)$, $p_2(x)$ irreducible
in the field $R$ of rationals. Find $R(\sqrt{2}, \sqrt[4]{2})$.

2. Find a Galois resolvent for $p(x) = x^3 + 2x + 1$.

3. Show that $p(x) = x^3 - 3(r^2 + 3s^2)x + 2r(r^2 + 3s^2)$ is normal for $r$ and $s$ integers
provided $p(x)$ is irreducible in the field of rationals.

4. The discriminant of $p(x) = x^3 + px + q$ is $\Delta = -4p^3 - 27q^2$. Show that the
discriminant of $p(x)$ of Prob. 3 is a perfect square.

5. Show that $\eta_1 = \varphi(\tau_1)/g'(\tau_1)$, $\varphi(x)$ defined by (8.44).


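8.14. Automorphisms. The Galois Group. Let us consider the field
$R(\sqrt{2})$ composed of all elements of the form $a + b\sqrt{2}$, $a$ and $b$ rational.
We consider the one-to-one correspondence between $a + b\sqrt{2}$ and its
conjugate, $a - b\sqrt{2}$, written $a + b\sqrt{2} \leftrightarrow a - b\sqrt{2}$. Let us look
more closely at this correspondence. If

$$
\begin{aligned}
x + y\sqrt{2} &\leftrightarrow x - y\sqrt{2} \\
u + v\sqrt{2} &\leftrightarrow u - v\sqrt{2}
\end{aligned}
$$

then

$$
\begin{aligned}
(x + y\sqrt{2}) + (u + v\sqrt{2}) = (x + u) + (y + v)\sqrt{2} &\leftrightarrow (x + u) - (y + v)\sqrt{2} \\
&= (x - y\sqrt{2}) + (u - v\sqrt{2}) \\
(x + y\sqrt{2})(u + v\sqrt{2}) = (xu + 2yv) + (xv + yu)\sqrt{2} &\leftrightarrow (xu + 2yv) - (xv + yu)\sqrt{2} \\
&= (x - y\sqrt{2})(u - v\sqrt{2})
\end{aligned}
$$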


A one-to-one mapping of a field into itself, written $\alpha \leftrightarrow \alpha'$, or $\alpha' = f(\alpha)$,
$\alpha = f^{-1}(\alpha')$, such that $\alpha \leftrightarrow \alpha'$, $\beta \leftrightarrow \beta'$, imply $\alpha + \beta \leftrightarrow \alpha' + \beta'$,
$\alpha\beta \leftrightarrow \alpha'\beta'$, for all $\alpha$ and $\beta$ of the field, is called an automorphism of the field.
From above we see that the mapping $a + b\sqrt{2} \leftrightarrow a - b\sqrt{2}$ is an
automorphism of $R(\sqrt{2})$. Every field has at least one automorphism
attached to it, the identity mapping, $\alpha \leftrightarrow \alpha$, for all $\alpha$ of the field. Not
all one-to-one mappings of a field into itself yield an automorphism.
Witness the mapping $x \leftrightarrow 2x$, with $x$ ranging over the field of real
numbers. For $y \leftrightarrow 2y$ we have $xy \leftrightarrow 2(xy) \neq (2x)(2y)$ whenever $xy \neq 0$. Under an
automorphism the zero element maps into itself, as does the unit element, for
$\alpha \leftrightarrow \alpha'$, $0 \leftrightarrow x$, imply $\alpha + 0 = \alpha \leftrightarrow \alpha' + x = \alpha'$, so that $x = 0$, and
$\alpha \leftrightarrow \alpha'$, $1 \leftrightarrow y$, imply $\alpha \cdot 1 = \alpha \leftrightarrow \alpha'y = \alpha'$, so that $y = 1$. We say that
0 and 1 are left invariant under any automorphism. Let the reader show
that the rationals of a field $F$ remain invariant for all automorphisms of
$F$. It is also a simple matter to show that the automorphisms of a field
form a group. If $\alpha = f_1(\beta)$, $\beta = f_2(\gamma)$ are automorphisms $A_1$ and $A_2$,
we define $A_1A_2$ as the automorphism $\alpha = f_1(f_2(\gamma)) = f_3(\gamma)$. Let the
reader show that under this definition of multiplication the set of
automorphisms of $F$ form a group. The unit element of the group is the
identity automorphism.
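The two defining properties of an automorphism can be tested numerically on sample elements of $R(\sqrt{2})$. The Python sketch below is a finite spot check, not a proof, and its names (`conj`, `double`, `is_homomorphism`, `samples`) are chosen here for illustration. The pair `(a, b)` stands for $a + b\sqrt{2}$ with rational $a$, $b$.

```python
from fractions import Fraction
from itertools import product

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2
    return (u[0] * v[0] + 2 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def conj(u):
    """The mapping a + b*sqrt2 <-> a - b*sqrt2."""
    return (u[0], -u[1])

def double(u):
    """The mapping x <-> 2x, which is one-to-one but not an automorphism."""
    return (2 * u[0], 2 * u[1])

# A small grid of rational sample elements.
samples = [(Fraction(a, 3), Fraction(b, 2)) for a, b in product(range(-2, 3), repeat=2)]

def is_homomorphism(f):
    """Check that f preserves sums and products on every pair of samples."""
    return all(
        f(add(u, v)) == add(f(u), f(v)) and f(mul(u, v)) == mul(f(u), f(v))
        for u in samples
        for v in samples
    )

conj_ok = is_homomorphism(conj)
double_ok = is_homomorphism(double)
```

Conjugation passes on every sample pair, while doubling fails as soon as a product is nonzero, which mirrors the counterexample in the text.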

Let us return to the irreducible polynomial $p(x)$ with coefficients in $F$.
We shall be interested in the extension field $N = F(\rho_1, \rho_2, \ldots, \rho_n)$ with
$\rho_i$, $i = 1, 2, \ldots, n$, such that $p(\rho_i) = 0$. We consider only those
automorphisms of $F(\rho_1, \rho_2, \ldots, \rho_n)$ which leave the elements of $F$ invariant.
The identity automorphism is one of this type. Let the reader show that
the set of all automorphisms of $F(\rho_1, \rho_2, \ldots, \rho_n)$ leaving the elements
of $F$ invariant form a group.


DEFINITION 8.9. The group $G$ of automorphisms of $F(\rho_1, \rho_2, \ldots, \rho_n)$
leaving $F$ invariant is called the Galois group of $p(x) = 0$, or the Galois
group of $F(\rho_1, \rho_2, \ldots, \rho_n)$ relative to $F$.


THEOREM 8.18. If $p(x)$ is normal (see Sec. 8.13), the Galois group of
$p(x)$ contains exactly $n$ automorphisms.
Proof. First of all we have

$$
F(\rho_1, \rho_2, \ldots, \rho_n) = F(\rho_1) = F(\rho_2) = \cdots = F(\rho_n)
$$

since $p(x)$ is assumed normal. If $p(x) = \sum_{k=0}^{n} a_k x^k$, then $a_k \leftrightarrow a_k$ for any
automorphism of the Galois group $G$. Let $\rho_1$ correspond to $\sigma$ for any
automorphism of $G$. Then

$$
0 = \sum_{k=0}^{n} a_k\rho_1^k \leftrightarrow \sum_{k=0}^{n} a_k\sigma^k = 0
$$

so that $\sigma$ must be a zero of $p(x)$. Thus $(\rho_1 \leftrightarrow \rho_1)$, $(\rho_1 \leftrightarrow \rho_2)$, $\ldots$, $(\rho_1 \leftrightarrow \rho_n)$ are the
only possible elements of $G$. We must remember that $(\rho_1 \leftrightarrow \rho_i)$
completely determines an automorphism of $G$ since any element of
$F(\rho_1, \rho_2, \ldots, \rho_n)$ is of the form $\sum_{k=0}^{n-1} b_k\rho_1^k$, with $b_k$, $k = 0, 1, 2, \ldots, n - 1$, in $F$.
If $p(x)$ were not normal, we could not make this statement. If $p(x)$ is
not normal, the order of $G$ is less than or equal to $n!$. Why? Finally,
we shall wish to use the fact that the Galois group $G$ can be made
isomorphic to a regular permutation group (see Theorem 8.4).

THEOREM 8.19. Let $p(x)$ be normal. If $\alpha$ is any element of $F(\rho = \rho_1)$
which remains invariant under all automorphisms of $G$, then $\alpha$ is an
element of $F$.

Proof. We know that all elements of $F$ remain invariant under $G$.
We wish to prove the converse if $p(x)$ is normal. Let $\alpha = \alpha_1$ be an
element of $F(\rho_1)$, and consider its conjugates. Then

$$
\begin{aligned}
\alpha_1 &= b_0 + b_1\rho_1 + b_2\rho_1^2 + \cdots + b_{n-1}\rho_1^{n-1} \\
\alpha_2 &= b_0 + b_1\rho_2 + b_2\rho_2^2 + \cdots + b_{n-1}\rho_2^{n-1} \\
&\;\;\vdots \\
\alpha_n &= b_0 + b_1\rho_n + b_2\rho_n^2 + \cdots + b_{n-1}\rho_n^{n-1}
\end{aligned}
$$

and $f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$ is the principal equation of
$\alpha = \alpha_1$ (see Theorem 8.15), and the coefficients of $f(x)$ are in $F$. But
$\alpha_1 = \alpha_2 = \cdots = \alpha_n$ since $\alpha_1$ is left invariant under $G$. Thus

$$
f(x) = (x - \alpha)^n = x^n - n\alpha x^{n-1} + \cdots + (-1)^n\alpha^n
$$

so that $-n\alpha$ is in $F$, which implies that $\alpha$ is in $F$. Q.E.D.

DEFINITION 8.10. If the Galois group is cyclic, the equation $p(x) = 0$
is said to be cyclic.


THEOREM 8.20. If $p(x) = \sum_{k=0}^{n} a_k x^k = 0$ is cyclic and normal, we can
solve for the roots of $p(x) = 0$ by a finite number of rational operations
(addition, multiplication, etc.) and root extractions on elements of $F(\omega)$,
where $\omega^n = 1$, $\omega \neq 1$, $\arg \omega = 2\pi/n$. The degree of $p(x)$ is $n$, and the
coefficients of $p(x)$ are in $F$. It can also be shown that $F(\omega)$ can be
generated by a finite number of rational operations and root extractions
on the elements of $F$. We omit proof of this latter statement.

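Proof. Since $p(x)$ is normal, $G$ is composed of exactly $n$ elements.
Moreover $G$ is assumed to be cyclic so that a single element generates $G$.
We represent this element by the permutation
$P = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ 2 & 3 & 4 & \cdots & 1 \end{pmatrix}$. Let
us consider the set of elements

$$
\begin{aligned}
\xi_1 &= \rho_1 + \rho_2 + \rho_3 + \cdots + \rho_n = -\frac{a_{n-1}}{a_n} \\
\xi_2 &= \rho_1 + \omega\rho_2 + \omega^2\rho_3 + \cdots + \omega^{n-1}\rho_n \\
\xi_3 &= \rho_1 + \omega^2\rho_2 + \omega^4\rho_3 + \cdots + \omega^{2(n-1)}\rho_n \\
&\;\;\vdots \\
\xi_n &= \rho_1 + \omega^{n-1}\rho_2 + \omega^{2(n-1)}\rho_3 + \cdots + \omega^{(n-1)^2}\rho_n
\end{aligned}
\tag{8.45}
$$

The $\xi_i$, $i = 1, 2, \ldots, n$, certainly belong to $F(\rho, \omega)$, $\rho = \rho_1$. Now
$P(\xi_2) = \rho_2 + \omega\rho_3 + \cdots + \omega^{n-1}\rho_1 = \xi_2/\omega$, considering $\omega$ as a parameter
in $\xi_2$. Thus $[P(\xi_2)]^n = \xi_2^n$ since $\omega^n = 1$. But $[P(\xi_2)]^n = P(\xi_2^n)$ since
raising $\xi_2$ to the $n$th power and then permuting the indices $1, 2, \ldots, n$
is equivalent to first permuting the $\rho$'s and then raising the new entity
to the $n$th power. Thus $P(\xi_2^n) = \xi_2^n$ so that $\xi_2^n$ is invariant under $P$ of $G$.
From $P^2(\xi_2^n) = P[P(\xi_2^n)] = P(\xi_2^n) = \xi_2^n$, etc., we see that $\xi_2^n$ is invariant
for all elements of $G$. The same result applies to $\xi_1^n, \xi_3^n, \xi_4^n, \ldots, \xi_n^n$.
Now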


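$$
\xi_2^n = \alpha_0(\rho_1, \rho_2, \ldots, \rho_n) + \alpha_1(\rho_1, \rho_2, \ldots, \rho_n)\,\omega + \cdots + \alpha_{n-1}(\rho_1, \rho_2, \ldots, \rho_n)\,\omega^{n-1}
$$

by direct expansion of $\xi_2^n$, since $\omega^n = 1$. We wish to show that $\alpha_0$,
$\alpha_1, \ldots, \alpha_{n-1}$ are in $F$ so that $\xi_2^n$ is in $F(\omega)$. Under any permutation
of $G$ we have $\alpha_0 \leftrightarrow \alpha_0'$, $\alpha_1 \leftrightarrow \alpha_1'$, $\alpha_2 \leftrightarrow \alpha_2'$, $\ldots$, $\alpha_{n-1} \leftrightarrow \alpha_{n-1}'$. Since $\xi_2^n$ is
invariant under $G$, we have

$$
\alpha_0 + \alpha_1\omega + \alpha_2\omega^2 + \cdots + \alpha_{n-1}\omega^{n-1}
= \alpha_0' + \alpha_1'\omega + \alpha_2'\omega^2 + \cdots + \alpha_{n-1}'\omega^{n-1} \tag{8.46}
$$

for all $\omega$ such that $\omega^n = 1$. Now the equation

$$
g(x) = b_0 + b_1x + \cdots + b_{n-1}x^{n-1} = 0
$$

has at most $n - 1$ roots unless all the $b_k$ vanish. Equation (8.46) is of this type yet has $n$
distinct zeros, namely, $1, \omega, \omega^2, \ldots, \omega^{n-1}$. Thus (8.46) can only be true
for all $\omega$ satisfying $\omega^n = 1$ if $\alpha_0 = \alpha_0'$, $\alpha_1 = \alpha_1'$, $\ldots$, $\alpha_{n-1} = \alpha_{n-1}'$. Hence
$\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ are invariant under $G$, so belong to $F$ by Theorem 8.19.
Hence $\xi_2^n$ is in $F(\omega)$. The same result applies to $\xi_3^n, \xi_4^n, \ldots, \xi_n^n$. The
$\xi_i$, $i = 1, 2, \ldots, n$, of (8.45) are $n$th roots of elements in $F(\omega)$, $\omega = e^{2\pi i/n}$.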

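We now look upon (8.45) as a linear system of equations in the
unknowns $\rho_1, \rho_2, \ldots, \rho_n$. This system has a unique solution provided
the Vandermonde determinant

$$
D(\omega) = \begin{vmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \omega & \omega^2 & \cdots & \omega^{n-1} \\
1 & \omega^2 & \omega^4 & \cdots & \omega^{2(n-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \omega^{n-1} & \omega^{2(n-1)} & \cdots & \omega^{(n-1)^2}
\end{vmatrix} \tag{8.47}
$$

is different from zero. Let the reader show that

$$
D(\omega) = \prod_{\substack{i, j = 0 \\ i > j}}^{n-1} (\omega^i - \omega^j) \neq 0
$$

Hence the $\rho_i$, $i = 1, 2, \ldots, n$, can be expressed as linear functions of the
$\sqrt[n]{\xi_i^n}$, with coefficients in $F(\omega)$. Q.E.D.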


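Example 8.22. In Example 8.21 we saw that $p(x) = x^3 - 3x + 1 = 0$ is normal.
It is also cyclic since its Galois group is of order 3. Let the roots of $p(x) = 0$ be
$\rho_1, \rho_2, \rho_3$. Then

$$
\begin{aligned}
\xi_1 &= \rho_1 + \rho_2 + \rho_3 = 0 \\
\xi_2 &= \rho_1 + \omega\rho_2 + \omega^2\rho_3 \\
\xi_3 &= \rho_1 + \omega^2\rho_2 + \omega\rho_3
\end{aligned}
\tag{8.48}
$$

with $\omega^3 = 1$, $\omega \neq 1$, $\omega^2 + \omega + 1 = 0$. We must find $\xi_2$, $\xi_3$ using the additional fact
that $\rho_1\rho_2 + \rho_2\rho_3 + \rho_3\rho_1 = -3$, $\rho_1\rho_2\rho_3 = -1$. Now

$$
\begin{aligned}
\xi_2\xi_3 &= \rho_1^2 + \rho_2^2 + \rho_3^2 + (\omega + \omega^2)(\rho_1\rho_2 + \rho_2\rho_3 + \rho_3\rho_1) \\
&= (\rho_1 + \rho_2 + \rho_3)^2 + (\rho_1\rho_2 + \rho_2\rho_3 + \rho_3\rho_1)(\omega + \omega^2 - 2) \\
&= 9 \\
\xi_2^3 + \xi_3^3 &= 2(\rho_1^3 + \rho_2^3 + \rho_3^3) - 3(\rho_1^2\rho_2 + \rho_2^2\rho_3 + \rho_3^2\rho_1 + \rho_1^2\rho_3 + \rho_3^2\rho_2 + \rho_2^2\rho_1) + 12\rho_1\rho_2\rho_3 \\
&= 2(\rho_1 + \rho_2 + \rho_3)^3 - 9(\rho_1 + \rho_2 + \rho_3)(\rho_1\rho_2 + \rho_2\rho_3 + \rho_3\rho_1) + 27\rho_1\rho_2\rho_3 \\
&= -27 \\
(\xi_2^3 - \xi_3^3)^2 &= (\xi_2^3 + \xi_3^3)^2 - 4\xi_2^3\xi_3^3 = (-27)^2 - 4(9)^3 = -2{,}187 \\
\xi_2^3 - \xi_3^3 &= -27\sqrt{3}\,i
\end{aligned}
$$

Hence $\xi_2^3 = -27\left(\frac{1 + \sqrt{3}\,i}{2}\right)$, $\xi_3^3 = -27\left(\frac{1 - \sqrt{3}\,i}{2}\right)$. Note that $\xi_2^3\xi_3^3 = 9^3$. The
cube roots of $\xi_2^3$, $\xi_3^3$ must be chosen so that $\xi_2\xi_3 = 9$. Note that $\xi_2^3$ and $\xi_3^3$ belong to
$R(\omega) = R(\sqrt{3}\,i)$. It is now a simple matter to solve (8.48) for $\rho_1, \rho_2, \rho_3$, which we
omit.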
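The omitted last step of Example 8.22 can be carried out numerically. The Python sketch below is an illustration with floating-point complex numbers, not the author's computation: it takes $\xi_2^3 = -27(1 + \sqrt{3}\,i)/2$ as an assumed starting value, extracts one cube root, fixes $\xi_3$ by the condition $\xi_2\xi_3 = 9$, and inverts the linear system (8.48) with $\xi_1 = 0$.

```python
import cmath

omega = cmath.exp(2j * cmath.pi / 3)           # primitive cube root of unity
xi1 = 0                                        # rho_1 + rho_2 + rho_3 = 0
xi2_cubed = -27 * (1 + cmath.sqrt(-3)) / 2     # assumed value of xi_2^3
xi2 = xi2_cubed ** (1 / 3)                     # one cube root (principal branch)
xi3 = 9 / xi2                                  # enforce xi_2 * xi_3 = 9

# Invert (8.48): 3*rho_1 = xi1 + xi2 + xi3, and similarly with powers of omega.
roots = [
    (xi1 + xi2 + xi3) / 3,
    (xi1 + omega**2 * xi2 + omega * xi3) / 3,
    (xi1 + omega * xi2 + omega**2 * xi3) / 3,
]

# Each root should satisfy x^3 - 3x + 1 = 0 up to rounding error.
residuals = [abs(r**3 - 3 * r + 1) for r in roots]
```

Any of the three cube roots of $\xi_2^3$ may be used, provided $\xi_3$ is then taken as $9/\xi_2$. A different choice only permutes the three roots.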


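Before concluding this chapter let us consider the following: We
consider $p(x) = x^4 + 9$, irreducible in the field of rationals. Let the roots
of $x^4 + 9 = 0$ be $\rho, -\rho, \sigma, -\sigma$. From $\rho(-\rho)\sigma(-\sigma) = 9$ we see that
$\sigma = \pm 3/\rho = \pm 3\rho^3/\rho^4 = \mp\tfrac{1}{3}\rho^3$. Hence $p(x)$ is normal. The
automorphisms comprising the Galois group of $p(x)$ are $(\rho \leftrightarrow \rho)$, $(\rho \leftrightarrow -\rho)$,
$(\rho \leftrightarrow \sigma)$, $(\rho \leftrightarrow -\sigma)$, and we can represent $G$ by the permutation group

$$
P_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix} \quad
P_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix} \quad
P_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix} \quad
P_4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}
$$

with $\rho_1 = \rho$, $\rho_2 = -\rho$, $\rho_3 = \sigma$, $\rho_4 = -\sigma$. This group is not cyclic. The
subgroup $G_1 = (P_1, P_2)$ leaves invariant a certain number of elements of
$F(\rho)$. Let us see which of these elements are invariant under $G_1$. $P_1$
leaves all elements invariant. If $\alpha = a + b\rho + c\rho^2 + d\rho^3$ is left
invariant under $P_2$, of necessity

$$
a + b\rho + c\rho^2 + d\rho^3 = a + b(-\rho) + c(-\rho)^2 + d(-\rho)^3
$$

which implies that $b = d = 0$. Hence the elements $a + b\rho^2$ are left
invariant under $G_1$. It is easy to show that these elements form a
subfield $F(\rho^2)$ of $F(\rho)$. $G_1$ is called the Galois group of $F(\rho)$ relative to $F(\rho^2)$.
Note that $F(\rho) \supset F(\rho^2) \supset F$.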
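That the Galois group of $x^4 + 9$ is not cyclic can be confirmed by composing permutations. In the Python sketch below, offered only as an illustration, each automorphism is written as the permutation it induces on the ordered roots $(\rho, -\rho, \sigma, -\sigma)$. The tuples are derived here from $\sigma = \mp\rho^3/3$ and are an assumption of the sketch, as are the helper names `compose` and `order`.

```python
# Bottom rows of the two-row permutation notation for the four automorphisms.
P1 = (1, 2, 3, 4)   # rho ->  rho   (identity)
P2 = (2, 1, 4, 3)   # rho -> -rho
P3 = (3, 4, 1, 2)   # rho ->  sigma
P4 = (4, 3, 2, 1)   # rho -> -sigma
G = [P1, P2, P3, P4]

def compose(p, q):
    """Return the permutation obtained by applying q first and then p."""
    return tuple(p[q[i] - 1] for i in range(4))

def order(p):
    """Smallest k >= 1 with p^k equal to the identity."""
    k, current = 1, p
    while current != P1:
        current = compose(p, current)
        k += 1
    return k

closed = all(compose(p, q) in G for p in G for q in G)   # G is closed under composition
orders = [order(p) for p in G]                           # element orders
is_cyclic = len(G) in orders                             # cyclic iff some element has order 4
G1 = [P1, P2]
g1_closed = all(compose(p, q) in G1 for p in G1 for q in G1)
```

Every element other than the identity has order 2, so no single element generates $G$, while $G_1 = (P_1, P_2)$ is a subgroup of order 2.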

The principal equation of $a + b\rho^2$ is

$$
\begin{aligned}
f(x) &= [x - (a + b\rho^2)][x - (a + b\rho^2)][x - (a + b\sigma^2)][x - (a + b\sigma^2)] \\
&= [(x - a)^2 - b(\rho^2 + \sigma^2)(x - a) + b^2\rho^2\sigma^2]^2 \\
&= [(x - a)^2 + 9b^2]^2
\end{aligned}
$$

so that $a + b\rho^2$ satisfies a quadratic equation [of lower degree than $p(x)$]
with coefficients in $F$. Thus we can solve a quadratic equation for $\rho^2$,
obtaining $\rho^2$ in terms of rational operations and extractions of roots of
elements in $F$. Another square root yields $\rho$. Of course we know all
this in advance since it is trivial to solve for the roots of $x^4 + 9 = 0$.
Let us note, however, that $G_1$ in the above example is a maximal normal
subgroup of $G$ and that $E$ is a maximal normal subgroup of $G_1$. The
factor groups $G/G_1$ and $G_1/E$ are of order 2 (a prime number). The fact
that 2 is a prime number and that a group whose order is prime is cyclic
(see Theorem 8.20 for the importance of this fact) leads to the
all-important theorem, which we state without proof.

THEOREM 8.21. If $G$ is the Galois group of an equation $p(x) = 0$
relative to its coefficient field $F$, a necessary and sufficient condition that
$p(x) = 0$ be solvable by radicals relative to $F$ is that the factors of
composition of $G$ be entirely primes.

It can be shown that in general a polynomial of degree greater than 4
cannot be solved by radicals. The equation $x^5 - 1 = 0$, however, can
be solved by radicals. Factoring, we have

$$
(x - 1)(x^4 + x^3 + x^2 + x + 1) = 0
$$

Considering $x^4 + x^3 + x^2 + x + 1 = 0$, let

$$
y = x + \frac{1}{x} \qquad y^2 = x^2 + \frac{1}{x^2} + 2
$$

so that upon division by $x^2$ one has $y^2 + y - 1 = 0$. One solves for $y$
and then for $x$ from $x^2 - yx + 1 = 0$.
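The two-step reduction for $x^5 - 1 = 0$ can be followed numerically. The Python sketch below is an illustration in floating-point arithmetic, with variable names chosen here: it solves $y^2 + y - 1 = 0$ by the quadratic formula and then $x^2 - yx + 1 = 0$ for each value of $y$.

```python
import cmath
import math

roots = [1 + 0j]                                   # the root x = 1 from the factor (x - 1)
for y in ((-1 + math.sqrt(5)) / 2, (-1 - math.sqrt(5)) / 2):   # roots of y^2 + y - 1 = 0
    disc = cmath.sqrt(y * y - 4)                   # discriminant of x^2 - y*x + 1 = 0 (negative here)
    roots.append((y + disc) / 2)
    roots.append((y - disc) / 2)

# Every computed value should satisfy x^5 = 1 up to rounding error.
residuals = [abs(x**5 - 1) for x in roots]
```

Only square roots are needed, so all five roots are expressed by radicals, as the text asserts.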


Problems 


1. Solve the quadratic $x^2 + bx + c = 0$ by the method of this section.

2. Show that the automorphisms of a field form a group.

3. Show that $D(\omega)$ of (8.47) is given by $D(\omega) = \prod(\omega^i - \omega^j)$, $i > j$.

4. Solve for the roots of x = 1.

5. Find the polynomial with coefficients in the field of rationals having
$\rho = \sqrt{2} + 1$ as a zero.




CHAPTER 9 


PROBABILITY THEORY AND STATISTICS 


9.1. Introduction. It appears that the mathematical theory of prob- 
ability owes its formation to the inquisitiveness of a professional gambler. 
In the seventeenth century Antoine Gombaud, Chevalier de Méré, proposed
some simple problems involving games of chance to the famous
French philosopher, writer, and mathematician, Blaise Pascal. It is to
mathematicians Pascal and Fermat that probability theory owes its
origin, though since their early works a great number of mathematicians 
have contributed to its development. Its applications range from sta- 
tistics to quantum theory. An understanding of probability theory is 
necessary to undertake studies of the modern theory of games and infor- 
mation theory. It will be useful, however, to consider some elementary 
combinatorial analysis before we attempt a definition of probability. 

9.2. Permutations and Combinations. Let us assume that we have a 
collection of white balls numbered 1 to m and a collection of red balls 
numbered 1 to n. In how many ways can we choose exactly one red 
and one white ball? To each white ball we can associate n red balls. 
Since there are $m$ white balls, it appears that there are $m \cdot n$ ways of
choosing exactly one white and one red ball. In this example we note
that the choice of a white ball does not affect the choice of a red ball,
and conversely. The events are said to be independent. We state
without formal proof the following theorem:

THEOREM 9.1. If there are $m$ ways of performing a first event and $n$
ways of performing an independent second event, there are $m \cdot n$ ways of
performing both events.

The use of Theorem 9.1 enables us to solve the following problem: 
Given a group of objects numbered 1 to n, in how many ways can we 
order these objects? We have n choices for the object which is to be 
placed first in our order. After a choice has been made there are n — 1 
choices for the object that is to be placed second in the order.
Continuing, we see that there are $n(n - 1)(n - 2) \cdots 2 \cdot 1 = n!$ ways of
arranging or permuting the objects. If we wish to consider the number
of arrangements or permutations of $n$ objects taken $r$ at a time, we arrive
at the answer $n(n - 1) \cdots (n - r + 1)$. We write

$$
{}_nP_r = {}^nP_r = P^n_r = n(n - 1)(n - 2) \cdots (n - r + 1)
= \frac{n(n - 1) \cdots (n - r + 1)(n - r) \cdots 1}{(n - r)(n - r - 1) \cdots 1}
= \frac{n!}{(n - r)!} \tag{9.1}
$$


Example 9.1. How many four-digit numbers can be formed from the integers
1, 2, . . . , 9, using any integer at most once? We have $P^9_4 = 9!/5! = 3{,}024$ as our
answer. If we were to use the integers as many times as we pleased, our answer would
be $9^4 = 6{,}561$.

Example 9.2. Given 3 red flags, 4 white flags, and 6 black flags, how many signals
can we send using the 13 flags? Let us assume for the moment that we can
distinguish between the red flags by numbering them 1, 2, 3, with the same statement
concerning the white and black flags. It is then obvious that 13! different signals
could be sent. Now the red flags can be permuted in 3! = 6 ways. Each permutation,
however, yields no new signal if the red flags are indistinguishable. We must
divide 13! by 3! to account for this characteristic of the red flags. By considering the
white and black flags we arrive at the correct answer, 13!/3!4!6!.
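The counts in Examples 9.1 and 9.2 are easy to reproduce. The short Python sketch below is an illustration using the standard-library functions `math.perm` and `math.factorial` (Python 3.8 or later is assumed), together with a brute-force enumeration as a cross-check. The variable names are chosen here.

```python
import math
from itertools import permutations

# Example 9.1: four-digit numbers from the integers 1..9.
no_repeats = math.perm(9, 4)                                  # 9!/5!
no_repeats_brute = sum(1 for _ in permutations(range(1, 10), 4))
with_repeats = 9 ** 4

# Example 9.2: signals from 3 red, 4 white, and 6 black flags.
flag_signals = math.factorial(13) // (
    math.factorial(3) * math.factorial(4) * math.factorial(6)
)
```

The flag count evaluates to 60,060, the number of distinct arrangements once flags of the same color are treated as indistinguishable.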


The permutations of the integers 1, 2, 3, 4 taken two at a time are 
listed below: 


12 13 14 23 24 34
21 31 41 32 42 43


A particular arrangement of objects considered independent of their
order or permutation is called a combination. We see that there are 6
combinations of the integers 1, 2, 3, 4 taken two at a time. Although 12
and 21 are different permutations, they yield a single combination. If
we wish to hire 3 secretaries from a group of 12, we are usually not
interested in the order of hiring the secretaries. Now let us see in how
many ways we can choose $r$ objects from among $n$, the order of the choices
being immaterial. Call this number ${}_nC_r = {}^nC_r = C^n_r = \binom{n}{r}$. Every
combination will yield $r!$ permutations. Thus ${}^nC_r\, r! = {}^nP_r$ so that

$$
{}^nC_r = \frac{n!}{r!\,(n - r)!} \tag{9.2}
$$

Note that $1 = {}^nC_n = n!/n!\,0!$ so that $0!$ is chosen to be 1. What does
${}^nC_0$ signify?


Example 9.3. In obtaining 5 cards from among 52 cards in draw poker we are not
interested in the order in which the cards are dealt. Thus there are ${}^{52}C_5$ combinations
of cards which can be dealt. Let the reader show that ${}^{52}C_5 = 2{,}598{,}960$. Let us
determine how many full houses (three of a kind and a pair) can be dealt. If we
consider the particular full house consisting of a pair of fives and three aces, we note that
there are ${}^4C_2 = 6$ ways of obtaining a pair of fives and ${}^4C_3 = 4$ ways of obtaining
three aces. Thus there are $6 \cdot 4 = 24$ different
combinations yielding three aces and a pair of fives. But there are 13 choices for the
rank of the card yielding three of a kind, and then 12 choices remain for the choice of
the rank of the pair. Thus there are $24 \cdot 13 \cdot 12 = 3{,}744$ different full houses that can
be dealt.
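The arithmetic of Example 9.3 can be checked in a few lines. The Python sketch below is an illustration using `math.comb` from the standard library (Python 3.8 or later is assumed). The variable names are chosen here.

```python
import math

total_hands = math.comb(52, 5)                 # all five-card hands
pair_of_fives = math.comb(4, 2)                # ways to pick 2 of the 4 fives
three_aces = math.comb(4, 3)                   # ways to pick 3 of the 4 aces
fixed_ranks = pair_of_fives * three_aces       # full houses "aces over fives"
full_houses = 13 * 12 * fixed_ranks            # 13 ranks for the triple, 12 left for the pair
```

Dividing `full_houses` by `total_hands` gives the chance of being dealt a full house, roughly 0.00144.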

Example 9.4. Let us consider a rectangular $m \times n$ grid. If we start at the lower
left-hand corner and are allowed to move only to the right and up, how many different
paths can we traverse to reach the upper right-hand corner? All told, we must move
a total of $m + n$ blocks. Once we choose any $m$ blocks from among the $m + n$ blocks
for our horizontal motions, of necessity, the remaining $n$ blocks will be vertical
motions. Thus there are ${}^{m+n}C_m = (m + n)!/m!\,n!$ different paths. Note that the
answer is symmetric in $m$ and $n$.
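The path count of Example 9.4 can be verified independently of the combinatorial argument. The Python sketch below is an illustration: the function `count_paths`, a name introduced here, counts paths by the recurrence "paths to a corner = paths to the corner on its left + paths to the corner below it", and the result is compared with ${}^{m+n}C_m$.

```python
import math

def count_paths(m, n):
    """Count right/up lattice paths from the lower left to the upper right of an m x n grid."""
    # paths[i][j] = number of paths reaching the corner i blocks right and j blocks up.
    paths = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            paths[i][j] = paths[i - 1][j] + paths[i][j - 1]
    return paths[m][n]

# Compare the recurrence with the closed form for small grids.
agreement = all(
    count_paths(m, n) == math.comb(m + n, m)
    for m in range(7)
    for n in range(7)
)
```

For instance, `count_paths(2, 2)` gives 6, the six orderings of two horizontal and two vertical moves.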

Example 9.5. Let us consider $(x + y)^n$, $n$ a positive integer. In the product
$(x + y)(x + y) \cdots (x + y)$ we note that terms of the type $x^r y^s$ will occur. Since a
term from each factor must be used exactly once, of necessity, $r + s = n$. The term
$x^r y^{n-r}$ can occur in exactly ${}^nC_r$ ways since we can choose $x$ from any $r$ of the $n$ factors.
Thus

$$
(x + y)^n = \sum_{r=0}^{n} \binom{n}{r} x^r y^{n-r}
$$

If we choose $x = y = 1$, we have $2^n = \sum_{r=0}^{n} \binom{n}{r}$.
From $(1 + x)^n = \sum_{r=0}^{n} \binom{n}{r} x^r$, $(1 + x)^n = \sum_{s=0}^{n} \binom{n}{s} x^s$, we have

$$
(1 + x)^{2n} = \sum_{k=0}^{2n} \binom{2n}{k} x^k
= \sum_{s=0}^{n} \sum_{r=0}^{n} \binom{n}{r}\binom{n}{s} x^{r+s}
= \sum_{k=0}^{2n} \sum_{r=0}^{k} \binom{n}{r}\binom{n}{k - r} x^k
$$

Equating coefficients of $x^k$ yields

$$
\binom{2n}{k} = \sum_{r=0}^{k} \binom{n}{r}\binom{n}{k - r}
$$
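The two identities of Example 9.5 can be spot-checked for small $n$. The Python sketch below is an illustration using `math.comb`, which returns 0 when the lower index exceeds the upper one, so terms with $k - r > n$ drop out automatically. The function name `convolution_sum` is introduced here.

```python
import math

def convolution_sum(n, k):
    """Right-hand side: sum over r of C(n, r) * C(n, k - r)."""
    return sum(math.comb(n, r) * math.comb(n, k - r) for r in range(k + 1))

# Check C(2n, k) = sum_r C(n, r) C(n, k - r) for all 0 <= k <= 2n, n up to 8.
identity_holds = all(
    convolution_sum(n, k) == math.comb(2 * n, k)
    for n in range(9)
    for k in range(2 * n + 1)
)

# Check 2^n = sum_r C(n, r).
row_sums_ok = all(
    sum(math.comb(n, r) for r in range(n + 1)) == 2 ** n for n in range(12)
)
```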
Problems

1. In how many ways can an ordered pair of dice be rolled? Ans. 36
2. Which number will occur most often when a pair of dice is rolled? Ans. 7
3. How many handshakes can 10 people perform two at a time? Ans. 45
4. How many five-card poker hands can be dealt consisting of exactly one pair
(other three cards different)? Ans. 1,098,240
5. Considering the grid of Example 9.4, show that the grid contains

$$
\frac{(m + 1)(n + 1)mn}{4}
$$

rectangles.
6. Consider a function of $p$ variables. How many $n$th distinct derivatives can be
formed? Hint: The grid of Example 9.4 is useful if we designate the horizontal lines
by $x_1, x_2, \ldots, x_p$. Ans. ${}^{n+p-1}C_{p-1}$


7. By differentiating $(1 + x)^n$ show that $\binom{n}{r} = \frac{n}{r}\binom{n - 1}{r - 1}$.

8. Show that $\sum_{r=0}^{n} (-1)^r \binom{n}{r} = 0$.

9. By considering $(1 + x)^n(1 - x)^n = (1 - x^2)^n$ show that

$$
\sum_{r=0}^{2k} (-1)^r \binom{n}{r}\binom{n}{2k - r} = (-1)^k \binom{n}{k}
$$

10. In how many ways can the integers 1, 2, 3, 4, 5 be ordered such that no integer
corresponds to its order in the sequence? For example, 21453 is such an ordering,
while 21354 is not, since 3 occupies the third position in the sequence.

Ans. $5!\left(\frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \frac{1}{5!}\right)$


9.3. The Meaning and Postulates of Probability Theory. Whereas the
psychologist, economist, political scientist, etc., can, with some measure
of success, communicate their ideas to the layman, the mathematician,
unfortunately, realizes that he has no hope of describing his chief
mathematical fields of interest with a reasonable prospect of being understood.
A modicum of hope prevails when we consider probability theory from
its most elementary viewpoint. It is rare indeed to find a man who has
not evinced interest at one time or another in breaking the bank at
Monte Carlo. However, one need only consider the vast number of
persons who believe that someday they will discover an invincible
system of gambling, to realize how little the layman actually understands
the basic ideas of probability theory. Probability theory has been called
the science of common sense. If this is the case, it is strange indeed that
two persons using common sense can differ so greatly from each other in
solving a problem involving probability theory.

Let us now investigate the meaning of the word "probability."
Webster's New International Dictionary states that probability is "the
likelihood of the occurrence of any particular form of an event, estimated as
the ratio of the number of ways in which that form might occur to the
whole number of ways in which the event might occur in any form."¹
The same dictionary defines "likelihood" as "Probability; as, it will rain
in all likelihood."¹ Let us consider the question, What is the probability
that it will rain on Sept. 9, 1957, in Los Angeles? To some mathematicians
this question is meaningless. There is only one Sept. 9, 1957. We
cannot compute the ratio of the number of times it has rained on Sept. 9,
1957, to the total number of days comprising Sept. 9, 1957, rain or shine,
at least not before Sept. 9, 1957. After Sept. 9, 1957, the ratio would be
zero or 1 depending on the weather. It is up to the meteorologist to give
us a precise answer. Either it will or it will not rain on Sept. 9, 1957, in
Los Angeles. If there is no front within 1,000 miles of Los Angeles on
Sept. 8, 1957, the meteorologist would predict no rain with a high degree
of "probability." But the word "probability" would not be used as a
mathematician defines probability. Let us now ask, What is the
probability that a five occur if a die be rolled? This question is, in a sense,
very much like the question asked above. Theoretically, if we knew the
initial position of the die and if we knew the stresses and strains of the
die along with the external forces, we could predict exactly which of the
six numbers of the die would occur face up. The same statement can
be made about the weather. Unfortunately for the meteorologist there
are too many variables involved in attempting to predict the exact state
of the weather. The hope of the meteorologist is to reduce the number
of relevant factors to a minimum in an attempt to predict the weather.
The die problem differs from the weather problem in the following way:
If we have the patience and time, we can continue to roll the die as often
as we please. If after $n$ throws we note that the number five has
occurred $r_n$ times, we can form the ratio $r_n/n$. One would be naive in
calling $r_n/n$ the probability of rolling a five, even if $n$ is large. There
are some who would define the probability of rolling a five as

$$
p = \lim_{n \to \infty} \frac{r_n}{n} \tag{9.3}
$$

The limit of a sequence cannot be found unless one knows every term of
the sequence (the $n$th term of the sequence must be given for all $n$). To
compute (9.3), one would have to perform an infinite number of
experiments, an obvious impossibility.

¹ By permission. From "Webster's New International Dictionary," Second
Edition, copyright, 1934, 1939, 1945, 1950, 1953, 1954, by G. & C. Merriam Company.

Let us turn, for the moment, to the science of physics. Newton’s 
second law of motion states that the force acting on a particle is propor- 
tional to the time rate of change of momentum of the particle. The 
particle of Newton’s second law of motion is an idealized point mass. No 
such mass occurs in nature. This does not act as a deterrent to the 
physicist. The motion of a gyroscope is computed on the basis that ideal 
rigid bodies exist. The close correlation between experiment and theory 
gives the physicist confidence in his so-called laws of nature. The math- 
ematician working with probability theory encounters the same difficulty. 
He realizes that a die or a roulette wheel is not perfect. Man, however, 
has the ability to make abstractions. He visualizes a perfect die, and, 
furthermore, he postulates that if a die be rolled all the six numbers on 
the die are ‘‘equally likely” to occur. It is, of course, impossible to 
prove that the occurrence of each number of a die is equally likely. 
With these idealized assumptions and definitions the mathematician can 
predict the probability of winning at dice. The success of the gambling 
houses in Las Vegas is sufficient evidence to the professional gambler that 
the idealized science of probability theory is on firm ground. The pure 
mathematician is not interested in games of chance, per se. Probability
theory, to the pure mathematician, is simply a set of axioms and
definitions from which he derives certain consequences or theorems. We now
consider the axioms of probability theory. The reader is urged to read
Sec. 10.7 concerning the union, intersection, complement, etc., of sets.
We shall find it advantageous to consider a simple example before
treating the general case.

If a pair of dice are rolled, the following events can happen as listed
below:

(1, 1)  (2, 1)  (3, 1)  (4, 1)  (5, 1)  (6, 1)
(1, 2)  (2, 2)  (3, 2)  (4, 2)  (5, 2)  (6, 2)
(1, 3)  (2, 3)  (3, 3)  (4, 3)  (5, 3)  (6, 3)
(1, 4)  (2, 4)  (3, 4)  (4, 4)  (5, 4)  (6, 4)
(1, 5)  (2, 5)  (3, 5)  (4, 5)  (5, 5)  (6, 5)
(1, 6)  (2, 6)  (3, 6)  (4, 6)  (5, 6)  (6, 6)


$E$ is the collection of all possible events. The elements of $E$ are called
elementary events. A subset of $E$, say, $F_1$, is the set of events
guaranteeing the occurrence of the number 1 on at least one of the two dice.

$F_1$: (1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
(1, 6), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1)

Another subset of $E$ is the set $F_2$ consisting of the events (1, 2), (2, 1),
and (6, 6). The union, or sum, of $F_1$ and $F_2$ is the set of all events or
elements belonging to either $F_1$ or $F_2$ or both, written $F_1 \cup F_2 = F_1 + F_2$.
In this example $F_1 + F_2$ consists of all the elements of $F_1$ plus the event
(6, 6). The intersection, or product, of $F_1$ and $F_2$, written $F_1 \cap F_2 = F_1F_2$,
is the set of elements belonging to both $F_1$ and $F_2$. In this example $F_1F_2$
is composed of the elements (1, 2) and (2, 1). If two sets $F_1$ and $F_2$ have
no elements in common, we say that their intersection is the null set,
written $F_1F_2 = 0$. The complement of a set $F$ is the set of elements of
$E$ not in $F$. What is the complement of the set $F_1$ defined above? The
complement of the full set $E$ is the null set 0. Now let $G$ be the set of
all subsets of $E$ including $E$ and the null set 0. The elements of $G$ are
called random events. $G$ is said to be a field of sets in the sense that the
sum, product, and complement of elements of $G$ are again elements of $G$. Let us now
attach to each element of $E$ (these elementary individual events are also
elements of $G$) the nonnegative number $\frac{1}{36}$. There is no mystery as to
the choice of the number $\frac{1}{36}$. First we note that $E$ is composed of 36
elements. The assumption that all 36 events are equally likely implies
that any single event has a probability of $\frac{1}{36}$ of occurring. If $A$ is any
subset of $E$ consisting of $r$ events, we define the probability of an
element of $A$ occurring as $P(A) = r/36$. Thus $P(F_1) = \frac{11}{36}$, $P(F_2) = \frac{1}{12}$,
$P(E) = 1$. We define $P(0) = \frac{0}{36} = 0$. Thus to each set $A$ of $G$ we
have defined a real nonnegative number. If $A$ and $B$ have no element
in common, then $P(A + B) = P(A) + P(B)$.
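The two-dice example translates directly into set operations. The Python sketch below is an illustration using built-in sets and exact fractions. The names `E`, `F1`, `F2`, and `P` follow the text, while the code itself is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import product

E = set(product(range(1, 7), repeat=2))        # the 36 elementary events
F1 = {e for e in E if 1 in e}                  # a 1 shows on at least one die
F2 = {(1, 2), (2, 1), (6, 6)}

def P(A):
    """Probability of a random event A, assuming all 36 outcomes equally likely."""
    return Fraction(len(A), len(E))

union = F1 | F2                                # F1 + F2
intersection = F1 & F2                         # F1 F2
complement_F1 = E - F1                         # no 1 on either die

# Additivity for events with no element in common.
A, B = F1, {(6, 6)}
additive = (A & B == set()) and P(A | B) == P(A) + P(B)
```

Here `P(F1)` is 11/36 and `P(F2)` is 1/12. The complement of `F1` has the 25 outcomes in which neither die shows a 1.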

Formally, a field of probability is defined as follows: Let $E$ be any
collection of elements $x, y, z, \ldots$, which are called elementary events,
and let $G$ be a collection of subsets of $E$. Assume that the following
postulates or axioms are satisfied:

I. $G$ is a field of sets.

II. $G$ contains $E$ and the null set.

III. To each set $A$ in $G$ there corresponds a nonnegative real number,
written $P(A)$. The number $P(A)$ is called the probability that an
element of $A$ will occur.

IV. $P(E) = 1$, that is, one of the events of $E$ is sure to happen.

V. If $A$ and $B$ of $G$ have no element in common, then

$$
P(A + B) = P(A) + P(B)
$$


The single toss of a coin is a simple example of a field of probability.
$E$ is composed of the elements $H$ (for heads) and $T$ (for tails). $G$
contains the sets $H$, $T$, $E = (H, T)$, 0. We define $p(H) = \frac{1}{2}$, $p(T) = \frac{1}{2}$,
$p(H, T) = 1$, $p(0) = 0$. $p(E) = 1$ implies that a head or tail will
certainly occur.

Events, such as tossing a coin, rolling dice, spinning a roulette wheel,
yield finite probability fields since there are only a finite number of
different events which can occur. If 5 cards are dealt from a pack of 52
cards, there are exactly ${}^{52}C_5$ different events which can occur. It is
logical to postulate that the probability of any single event occurring
from among the ${}^{52}C_5$ different events be given by $p = 1/{}^{52}C_5$. This is
what is meant by an "honest" deal.

It is important that the reader understand Postulate V. In rolling a
die the event $A = (1, 3)$ means that $A$ occurs if a one or a three is face
up on the die. The event $B = (4)$ means that $B$ occurs if a four is face
up. Now $A$ and $B$ have no elements in common, that is, if $A$ occurs,
$B$ cannot occur, and conversely. We say that $A$ and $B$ are mutually
exclusive events. For a perfect die, $p(1, 3) = \frac{1}{3}$, $p(4) = \frac{1}{6}$, so that
Postulate V states that $p(A + B) = \frac{1}{3} + \frac{1}{6} = \frac{1}{2}$. The probability that a
one, three, or four occurs is $\frac{1}{2}$, the expected answer. A simple working
rule for the reader is the following:

RULE 1. If $A$ and $B$ are mutually exclusive events with probabilities
$p(A)$ and $p(B)$, respectively, then the probability that either $A$ or $B$
occurs is

$$
p(A + B) = p(A) + p(B)
$$

The reader should give an example for which $p(A + B) < p(A) + p(B)$.
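Rule 1 and the role of mutual exclusiveness can be seen on a single die. The Python sketch below is an illustration with exact fractions: the events `A` and `B` are those of the text, while `C` is an extra event chosen here to overlap `A`, so it is an assumption of the sketch and only one of many possible choices.

```python
from fractions import Fraction

faces = {1, 2, 3, 4, 5, 6}

def p(event):
    """Probability of an event on a perfect die, all six faces equally likely."""
    return Fraction(len(event & faces), len(faces))

A = {1, 3}
B = {4}
C = {3, 4}                                     # shares the face 3 with A

exclusive_sum = p(A | B)                       # A and B are mutually exclusive
overlapping_sum = p(A | C)                     # A and C are not
```

For the mutually exclusive pair the probabilities add, 1/3 + 1/6 = 1/2. For the overlapping pair the face 3 would be counted twice, so `p(A | C)` = 1/2 is smaller than `p(A) + p(C)` = 2/3.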


344 ELEMENTS OF PURE AND APPLIED MATHEMATICS 


Problems

1. What is the probability of throwing a seven with two dice? Ans. 1/6
2. What is the probability of throwing a four, eight, or twelve with two dice?
Ans. 1/4
3. Five cards are dealt from a deck of 52 cards. Find the probability of obtaining
the following poker hands: three of a kind, a flush, a straight.
Ans. 88/4165, 33/16,660, $10 \cdot 4^5/{}^{52}C_5$
4. A set of balls is numbered from 1 to 20. Two balls are drawn simultaneously
at random. What is the probability that their sum is 14? Hint: There are
${}^{20}C_2 = 190$ ways of drawing two balls. Ans. 3/95
5. In Prob. 4 what is the probability that the sum of the two numbers is 14 if a ball
is drawn and replaced and then a second ball is drawn? Ans. 13/400
6. Five coins are tossed. What is the probability that exactly three heads occur?
What is the probability that more than two heads occur? Ans. 5/16, 1/2
7. If n coins are tossed, show that the probability that exactly r heads occur is
${}^{n}C_r/2^n$.
8. If a coin is tossed and a pair of dice are rolled, what is the probability that a head
occurs and a total of 5 is rolled? Ans. 1/18
9. Show that Postulates IV and V imply p(0) = 0.
10. Let the complement of A be denoted by $\bar{A}$ so that $A + \bar{A} = E$. Show that
$p(\bar{A}) = 1 - p(A)$.
11. Let A and B be elements of a field G. If p(A) ≠ 0, define $p_A(B)$ by

$$p_A(B) = \frac{p(A \cap B)}{p(A)}$$

$p_A(B)$ is called the conditional probability of the event B under the condition A.
Show that

$$p_B(A)\,p(B) = p_A(B)\,p(A)$$

Give a geometric interpretation of $p_A(B)$, assuming A and B are point sets in a plane.


9.4. Theorem of Bayes. Let us consider the following simple example:
Assume that one throws a dart at a target whose area is taken to be 1.
We assume further that the dart is sure to strike the target and that the
dart is equally likely to strike anywhere on the target. This means
that if A is any area of the target then the probability that the dart strikes
a point of A is simply the area of A, written p(A). Now let A and B be
overlapping areas (see Fig. 9.1).

Fig. 9.1

Let us consider a second person located in another room who knows
that a dart will be thrown at the target. The probability that the dart
falls in A will be given by p(A). Let us now assume that after the dart is
thrown our second person asks the following question: Has the dart fallen
in the area B? He receives the truthful answer, "yes." How does this
affect this person's opinion as to the probability that the dart has also
fallen in A? We can answer this question in the following manner: It is
fairly obvious that if it is known that the dart lies in B then the only way
it can also lie in A is for the dart to have struck the region $A \cap B$, and
the probability of this happening before it is known that the dart has
fallen in B is simply the area of $A \cap B$, written $p(A \cap B)$. Since the
dart is known to be somewhere in B, it appears that the ratio of the area
of $A \cap B$ to the area of B will yield the probability that the dart also
lies in $A \cap B$. For example, if $A \cap B$ is 1/3 the area of B, then the chance
that the dart is in $A \cap B$ if it is known that it is in B is simply 1/3. We
write

$$p_B(A) = \frac{p(A \cap B)}{p(B)} \qquad p(B) \neq 0 \tag{9.4}$$

and define $p_B(A)$ as the probability that the event A has happened if it
is known that the event B has happened. We may also say that $p_B(A)$ is
the conditional probability of the event A under the condition B.

It is to be noted that (9.4) is a definition and requires no proof. Experi- 
mentally, however, one might attempt to verify (9.4) to some extent. 
Every time an observer learns that the dart is in B, he records whether 
the dart is in A or is not in A. He is not concerned with those experi- 
ments for which the dart lies outside of B. The ratio of the number of 
successes to the total number of tries (the dart must be in B) yields a 
fraction lying between zero and 1. For a large number of experiments 
it is reasonable to hope that this number is close to the number $p_B(A)$.

Since the roles of A and B can be interchanged, we have from (9.4)

$$p_A(B) = \frac{p(B \cap A)}{p(A)} \qquad p(A) \neq 0 \tag{9.5}$$

Combining (9.4) and (9.5) yields one form of Bayes's theorem,

$$p_B(A) = \frac{p(A)\,p_A(B)}{p(B)} \tag{9.6}$$

since $p(B \cap A) = p(A \cap B)$. p(A) is called the a priori probability
that A occurs, whereas $p_B(A)$ is called the a posteriori probability that
A occurs under the hypothesis that B occurs.


Example 9.6. A group of 100 girls contains 30 blondes and 70 brunettes. Twenty-five
of the blondes are blue-eyed, and the rest are brown-eyed, whereas 55 of the
brunettes are brown-eyed, and the rest are blue-eyed. A girl is picked at random.
The a priori probability that she is a blonde and blue-eyed is 25/100 = 1/4. If we pick a
girl at random and find out that she is blue-eyed, what is the probability that she is
blonde? Let x denote the quality of being blonde and y denote the quality of being
blue-eyed. We are interested in determining $p_y(x)$. Now p(y) = 40/100 = 0.4, since
40 of the 100 girls are blue-eyed. Moreover, $p(x \cap y) = 1/4$. Applying (9.4) yields

$$p_y(x) = \frac{p(x \cap y)}{p(y)} = \frac{5}{8}$$

Note that there are 25 blue-eyed blondes and 15 blue-eyed brunettes, so that the
appearance of a blue-eyed girl means that the probability of the girl being a blonde is
25/40 = 5/8. In information theory one determines the ratio of the conditional
probability to the a priori probability and defines the amount of "bits" of information as
the logarithm (base 2) of this ratio. In this example,
$I = \log_2 \dfrac{5/8}{0.3} = \log_2 \dfrac{25}{12} \approx 1.06$.
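
The arithmetic of Example 9.6 can be reproduced from the head counts alone. This is a small sketch under the example's own numbers; the variable names are illustrative, and the information measure is computed exactly as the text describes it (base-2 logarithm of the ratio of conditional to a priori probability).

```python
from fractions import Fraction
from math import log2

total = 100
blondes = 30
blue_blondes = 25      # blue-eyed blondes
blue_brunettes = 15    # 70 brunettes minus 55 brown-eyed ones

p_y = Fraction(blue_blondes + blue_brunettes, total)  # p(y): blue-eyed
p_xy = Fraction(blue_blondes, total)                   # p(x and y)
p_x = Fraction(blondes, total)                         # a priori p(x): blonde

p_x_given_y = p_xy / p_y                               # definition (9.4)
bits = log2(float(p_x_given_y / p_x))                  # information in bits
```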


We can extend the result of (9.5) as follows: Let E be a collection of
elementary probability events $A_1, A_2, \ldots, A_n$ with the $A_i$, $i = 1,
2, \ldots, n$, mutually exclusive, so that

$$E = A_1 + A_2 + \cdots + A_n = \sum_{i=1}^{n} A_i$$

Now let B be any subset of E. Then

$$B = B \cap E = B \cap A_1 + B \cap A_2 + \cdots + B \cap A_n \tag{9.7}$$

Let the reader deduce that $B \cap A_i$, $i = 1, 2, \ldots, n$, are mutually
exclusive. Thus

$$p(B) = p(B \cap A_1) + p(B \cap A_2) + \cdots + p(B \cap A_n) = \sum_{i=1}^{n} p(B \cap A_i) \tag{9.8}$$

Applying (9.5) yields

$$p(B) = \sum_{i=1}^{n} p(A_i)\,p_{A_i}(B) \tag{9.9}$$

Substituting (9.9) into (9.6) yields Bayes's formula,

$$p_B(A_i) = \frac{p(A_i)\,p_{A_i}(B)}{\sum_{j=1}^{n} p(A_j)\,p_{A_j}(B)} \tag{9.10}$$


It is important to realize that the event B need not be a subset of E
in the ordinary sense. For example, the events $A_1, A_2, \ldots, A_n$ might
be different urns containing various assortments of red and white balls.
The event B could be the successive drawing of three red balls from any
particular urn, each ball being returned to its urn before the next drawing.
Formulas (9.7) to (9.10) would still hold. By $B \cap A_i = A_i \cap B$
we mean the composite event of choosing $A_i$ and then drawing three
successive red balls as described above.


Example 9.7. Let us consider two urns. Urn I contains three red and five white
balls, and urn II contains two red and five white balls. An urn is chosen at random
(p(I) = 1/2, p(II) = 1/2), and a ball chosen at random from this urn turns out to be red
($p_{\mathrm{I}}(R) = 3/8$, $p_{\mathrm{II}}(R) = 2/7$). What is the a posteriori probability that the red ball came
from urn I? Applying (9.10) yields

$$p_R(\mathrm{I}) = \frac{p(\mathrm{I})\,p_{\mathrm{I}}(R)}{p(\mathrm{I})\,p_{\mathrm{I}}(R) + p(\mathrm{II})\,p_{\mathrm{II}}(R)} = \frac{\tfrac12 \cdot \tfrac38}{\tfrac12 \cdot \tfrac38 + \tfrac12 \cdot \tfrac27} = \frac{21}{37}$$


Example 9.8. Two balls are placed in an urn as follows: A coin is tossed twice, and
a white ball is placed in the urn if a head occurs, with a red ball placed in the urn if a
tail occurs. Let $A_0$, $A_1$, $A_2$ represent the events of the urn containing none, one, and
two red balls, respectively. Balls are drawn from the urn three times in succession
(always returned before the next drawing), and it is found that on all three occasions
a red ball was drawn. What is the probability that both balls in the urn are red?
Let B be the event of drawing three successive red balls. We are interested in
computing $p_B(A_2)$. Applying (9.10) yields

$$p_B(A_2) = \frac{p(A_2)\,p_{A_2}(B)}{p(A_0)\,p_{A_0}(B) + p(A_1)\,p_{A_1}(B) + p(A_2)\,p_{A_2}(B)}$$

From $p(A_0) = 1/4$, $p(A_1) = 1/2$, $p(A_2) = 1/4$, $p_{A_0}(B) = 0$, $p_{A_1}(B) = (1/2)^3 = 1/8$, $p_{A_2}(B) = 1$,
we obtain $p_B(A_2) = 4/5$. Let the reader show that, if $B_n$ represents the successive
drawing of n red balls, then $p_{B_n}(A_2)$ tends to 1 as $n \to \infty$. Is this reasonable to expect?
We note that $p_{A_1}(B)$ represents the probability of drawing three successive red balls
from an urn containing one red and one white ball. We liken this problem to that of
finding the probability of tossing three successive heads. The event space for this
case is (H, H, H), (H, H, T), (H, T, H), (H, T, T), (T, H, H), (T, H, T), (T, T, H),
(T, T, T), involving eight equally likely events. The a priori probability of the event
(H, H, H) is 1/8. We note that, for a single toss of the coin, p(H) = 1/2. Is it reasonable
to expect that p(H, H, H) = p(H)p(H)p(H)? This question is answered in the next
section.
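
Bayes's formula (9.10) is mechanical enough to sketch in a few lines. The function below is an illustration only: the names `posterior` and `ex_9_8` are assumptions, and exact fractions are used so that the results of Examples 9.7 and 9.8 can be compared without rounding.

```python
from fractions import Fraction

def posterior(priors, likelihoods, k):
    """Bayes's formula (9.10): p_B(A_k) from the p(A_i) and the p_{A_i}(B)."""
    total = sum(p * l for p, l in zip(priors, likelihoods))  # p(B), as in (9.9)
    return priors[k] * likelihoods[k] / total

# Example 9.7: urn I (3 red, 5 white) and urn II (2 red, 5 white).
half = Fraction(1, 2)
ex_9_7 = posterior([half, half], [Fraction(3, 8), Fraction(2, 7)], 0)

def ex_9_8(n):
    """Example 9.8 with n successive red draws instead of three."""
    priors = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
    likelihoods = [Fraction(0), Fraction(1, 2) ** n, Fraction(1)]
    return posterior(priors, likelihoods, 2)
```

Calling `ex_9_8(n)` for growing n illustrates the limit the reader is asked to establish: each further red draw makes the one-red-ball urn less plausible.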
Problems

1. Three urns contain, respectively, 2 red and 3 black balls, 1 red and 4 black balls,
3 red and 1 black ball. An urn is chosen at random and a ball drawn from it. If the
ball is red, what is the probability that it came from the first or second urn? Ans. 4/9
2. Urn I contains two red and three black balls. Urn II contains three red and two
black balls. A ball is chosen at random from urn I and placed in urn II. A ball is
then chosen at random from urn II. If this ball is red, what is the probability that a
red ball was transferred from urn I to urn II? Ans. 8/17


9.5. Independent Events. Events Not Mutually Exclusive. If we
rewrite formula (9.5), we have

$$p(A \cap B) = p(A)\,p_A(B) \tag{9.11}$$

Formula (9.11) states that the probability that both A and B happen is
the probability that A happens times the probability that B will happen
if A happens. What can we surmise if $p(A \cap B) = p(A)p(B)$? Of
necessity, $p_A(B) = p(B)$, so that the a priori probability of B happening,
p(B), does not depend on the event A. We say that A and B are
independent events. Two events are independent, by definition, if

$$p(A \cap B) = p(A)\,p(B) \tag{9.12}$$

A set of probability events, $A_1, A_2, \ldots, A_n$, are said to be mutually
independent if

$$p(A_i \cap A_j) = p(A_i)\,p(A_j) \qquad i \neq j \tag{9.13}$$

for $i, j = 1, 2, \ldots, n$.
If we consider, for example, the tossing of a coin and the rolling of a
die, we obtain the event space of 12 elements,

(H, 1)  (H, 2)  (H, 3)  (H, 4)  (H, 5)  (H, 6)
(T, 1)  (T, 2)  (T, 3)  (T, 4)  (T, 5)  (T, 6)

We can assign the a priori probability of 1/12 to each event. The probability
that a head occurs and a six occurs is taken to be 1/12. If we look
upon the rolling of a die as completely independent of the tossing of a
coin, we note that p(H) = 1/2, p(6) = 1/6, and $p(H \cap 6) = \tfrac12 \cdot \tfrac16 = \tfrac{1}{12}$.
Assigning the a priori probability of 1/12 is equivalent to assuming that
the two events are independent.


Example 9.9. A die is rolled n times. We compute the probability that a six occurs
at least once. The a priori probability of failing to throw a six on any given toss of the
die is 5/6. If we assume that each successive roll of the die is independent of the
previous throws, the probability of not obtaining a six in n throws is $(5/6)^n$. Hence the
probability of rolling at least 1 six in n tosses is $1 - (5/6)^n$. For n = 3, p < 1/2, while for
n = 4, p > 1/2.

Example 9.10. The game of craps is played as follows: A pair of dice is rolled, and
the player wins immediately if a seven or eleven occurs. He loses if a two, three, or
twelve arises on the first throw. If a four, five, six, eight, nine, or ten is thrown, the
player continues to roll the dice until he duplicates his first toss or until a seven occurs.
The numbers two, three, eleven, twelve are disregarded after the first toss. If the
player duplicates his first toss before rolling a seven, he wins; otherwise he loses. We
compute the probability that the man rolling the dice will win. At gambling
establishments an opponent of the roller of the dice (except the establishment) does not win
if the player rolls a twelve. The probability of winning on the first toss is

$$p(7) + p(11) = \frac{6}{36} + \frac{2}{36} = \frac{2}{9}$$

since there are 6 ways of rolling a seven, 2 ways of rolling an eleven, and 36 total
possible throws of two dice. The player can also win by rolling a four and then rolling
a four before a seven. The probability of rolling a four is 3/36, and the probability of
rolling a four before a seven is 3/9 = 1/3, since there are 3 ways of rolling a four against
6 ways of rolling a seven. Thus the probability of rolling a four and then
rolling another four before a seven is $\tfrac{3}{36} \cdot \tfrac13 = \tfrac{1}{36}$. The same reasoning for the
numbers five, six, eight, nine, ten yields

$$p = \frac{2}{9} + 2\left(\frac{1}{36} + \frac{2}{45} + \frac{25}{396}\right) = \frac{244}{495} = 0.49292\ldots$$
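
The craps computation can be verified by counting dice outcomes. The sketch below assumes the rules exactly as stated in Example 9.10; the names `ways`, `p`, and `win` are illustrative.

```python
from fractions import Fraction

# Number of ways to roll each total with two dice.
ways = {t: sum(1 for a in range(1, 7) for b in range(1, 7) if a + b == t)
        for t in range(2, 13)}

def p(total):
    """A priori probability of rolling the given total with two dice."""
    return Fraction(ways[total], 36)

# Immediate win on the first toss.
win = p(7) + p(11)

# Win by making the point: after the first toss only the point and the
# seven matter, so the chance of the point coming first is
# ways[point] / (ways[point] + ways[7]).
for point in (4, 5, 6, 8, 9, 10):
    win += p(point) * Fraction(ways[point], ways[point] + ways[7])
```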




Example 9.11. Bernoulli Trials. Let p be the probability of success of a certain
experiment, q = 1 - p the probability of its failure. Assume that the experiment is
performed n times, the probability of success remaining constant, while the result of
each experiment is independent of all the others. Such a sequence is called a
Bernoulli sequence. The simplest example is illustrated by the repeated toss of a coin.
We determine now the probability of attaining exactly r successes in n trials. In the
case of the coin problem a particular successful sequence would be the sequence
$H_1 H_2 \cdots H_r T_{r+1} T_{r+2} \cdots T_n$, if we were interested in obtaining exactly r heads.
There are obviously ${}^{n}C_r$ different sequences containing exactly r heads and n - r tails.
The probability of obtaining any particular sequence is $(1/2)^n$. Thus the probability
of obtaining exactly r heads in n tosses of a coin is given by ${}^{n}C_r/2^n$. Let the reader
show that

$$P_r = {}^{n}C_r\, p^r q^{n-r} \tag{9.14}$$

is the probability of obtaining exactly r successes in n trials for the general Bernoulli
sequence. It can be shown that the number r which makes $P_r$ a maximum for a
given n and p is the greatest integer less
than or equal to (n + 1)p.
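
Formula (9.14) and the statement about the most probable number of successes can be checked for a particular case. This is a sketch only: the values n = 10 and p = 3/10 are assumed for illustration, and `bernoulli_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb, floor

def bernoulli_pmf(n, r, p):
    """P_r of (9.14): probability of exactly r successes in n trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, Fraction(3, 10)
pmf = [bernoulli_pmf(n, r, p) for r in range(n + 1)]

# The r that maximizes P_r, found by direct search.
mode = max(range(n + 1), key=lambda r: pmf[r])
predicted = floor((n + 1) * p)   # greatest integer <= (n + 1)p
```

When (n + 1)p happens to be an integer, two adjacent values of r give the same maximal P_r; the direct search above then returns the smaller of the two.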


We now consider events which may not be mutually exclusive. Let
E be an event space with subsets $A_1, A_2, \ldots, A_n$. The $A_i$, $i = 1,
2, \ldots, n$, need not be mutually exclusive (see Fig. 9.2).

Certainly

$$p(A_1) + p(A_2) + p(A_3) \geq p(A_1 \cup A_2 \cup A_3)$$

since there may be overlapping regions of $A_1$, $A_2$, $A_3$. Geometrically, it
is easy to verify that

$$A_1 \cup A_2 \cup A_3 = A_1 + A_2 + A_3 - A_1 \cap A_2 - A_2 \cap A_3 - A_3 \cap A_1 + A_1 \cap A_2 \cap A_3 \tag{9.15}$$

In $A_1 + A_2 + A_3$ we have counted $A_1 \cap A_2$ twice; so we subtract
$A_1 \cap A_2$, with a similar statement for $A_1 \cap A_3$, $A_2 \cap A_3$. In $A_1 + A_2
+ A_3$ we have counted $A_1 \cap A_2 \cap A_3$ three times, and then we have
subtracted $A_1 \cap A_2 \cap A_3$ three times in $A_1 \cap A_2$, etc. Thus we add
$A_1 \cap A_2 \cap A_3$ to obtain (9.15). In general

$$p\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{i} p(A_i) - \sum_{i<j} p(A_i \cap A_j) + \sum_{i<j<k} p(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1}\, p(A_1 \cap A_2 \cap \cdots \cap A_n) \tag{9.16}$$



Example 9.12. Balls numbered 1 to n are placed at random in n urns numbered
1 to n, one ball to each urn. What is the probability that the number of at least one
ball corresponds to the number of the urn in which it is placed? Let $A_i$ be the event
of having the ith ball in the ith urn regardless of the distribution of the rest of the balls.
Then $p(A_i) = 1/n$ for $i = 1, 2, \ldots, n$. $p(A_i \cap A_j)$ represents the probability
that both the ith and jth ball be in their proper urns. From (9.5),

$$p(A_i \cap A_j) = p(A_j)\,p_{A_j}(A_i)$$

Now $p_{A_j}(A_i) = 1/(n - 1)$, since, if it is known that the jth ball is in the proper
urn, there is one chance in n - 1 that the ith ball will be in its proper urn. Thus
$p(A_i \cap A_j) = 1/n(n - 1)$. We note also that there are ${}^{n}C_2$ different $A_i \cap A_j$.
Continuing, it is easy to see that

$$P_n = n \cdot \frac{1}{n} - \binom{n}{2}\frac{1}{n(n-1)} + \binom{n}{3}\frac{1}{n(n-1)(n-2)} - \cdots = 1 - \frac{1}{2!} + \frac{1}{3!} - \cdots + (-1)^{n+1}\frac{1}{n!}$$

Let the reader show that $\lim_{n \to \infty} P_n = (e - 1)/e$.
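
The alternating series for the matching problem can be compared with a brute-force count over all placements of the balls. The sketch below is illustrative; `match_formula` and `match_bruteforce` are assumed names, and the brute-force count is only practical for small n.

```python
from fractions import Fraction
from itertools import permutations
from math import e, factorial

def match_formula(n):
    """P_n = 1 - 1/2! + 1/3! - ... + (-1)^(n+1)/n! from Example 9.12."""
    return sum(Fraction((-1) ** (k + 1), factorial(k)) for k in range(1, n + 1))

def match_bruteforce(n):
    """Fraction of the n! placements in which at least one ball matches its urn."""
    hits = sum(1 for perm in permutations(range(n))
               if any(perm[i] == i for i in range(n)))
    return Fraction(hits, factorial(n))
```

For n = 15 the formula already agrees with (e - 1)/e to about ten decimal places, which is consistent with the limit stated above.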

We conclude this section by considering the simple one-dimensional
random-walk problem. Assume that a person starts at the origin. A
coin is tossed, with the result that if a head occurs the person will move
one unit to the right and if a tail occurs the person will move one unit to the
left. The process is repeated n times. What is the probability that the
final position is located at x = r? Let u be the number of moves to
the right and v the number of moves to the left that can occur so that
the final position will be at x = r. Then u - v = r, u + v = n, so that
u = (r + n)/2, v = (n - r)/2. There are ${}^{n}C_u$ different ways in which
one can move to the right, the rest of the moves being to the left. The
probability of ending at x = r is thus ${}^{n}C_{(r+n)/2}/2^n$. This problem is
similar to that of Example 9.4.
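
A short sketch of the random-walk probability follows. It adds one detail the formula leaves implicit: the position x = r can be reached only when n + r is even and |r| does not exceed n, so the function returns zero otherwise. The name `walk_prob` is an assumption.

```python
from fractions import Fraction
from math import comb

def walk_prob(n, r):
    """Probability that an n-step fair random walk ends at x = r."""
    if (n + r) % 2 != 0 or abs(r) > n:
        return Fraction(0)        # parity or range makes x = r unreachable
    u = (n + r) // 2              # number of moves to the right
    return Fraction(comb(n, u), 2 ** n)
```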


Problems

1. What is the probability that a seven occurs at least once in n tosses of a pair of
dice? Ans. $1 - (5/6)^n$

2. Derive (9.14).

3. Show that

$$p[A_1 \cap (A_2 \cup A_3)] = p(A_1 \cap A_2) + p(A_1 \cap A_3) - p(A_1 \cap A_2 \cap A_3)$$

What does this formula reduce to if $A_1$, $A_2$, and $A_3$ are mutually independent?

4. Show that the probability of having at least r successes in a sequence of n
Bernoulli trials is $P = \sum_{i=r}^{n} \binom{n}{i} p^i q^{n-i}$.

5. Show that $\sum_{i=0}^{n} \binom{n}{i} p^i (1 - p)^{n-i} = 1$.

6. A coin is tossed in a Bernoulli sequence. What is the probability of obtaining
m heads before n tails appear? Hint: If there are at least m heads in the first
m + n - 1 tosses of the coin, the game is won.
Ans. $\dfrac{1}{2^{m+n-1}} \sum_{r=m}^{m+n-1} \binom{m+n-1}{r}$

7. Generalize the random-walk problem to the case of two dimensions with p = 1/4
for each of the possible motions.

8. A and B have, respectively, n + 1 and n coins. If they toss their coins
simultaneously, what is the probability that A will have more heads than B? Ans. p = 1/2

9. What is the probability of getting exactly 2 sixes in three throws of a die?
Ans. 5/72

10. Coin 1 has a probability of p of getting a head, and coin 2 has a probability of q
of getting a tail. We start with coin 1 and keep tossing it until a tail occurs,
whereupon we switch to coin 2. Whenever a tail occurs on coin 2, we switch to coin 1, etc.
What is the probability that the nth toss will be performed on coin 1? Hint: Let $P_n$
be the desired probability so that $P_{n+1} = P_n p + (1 - P_n)q$, $P_1 = 1$. Show that

$$P_{n+1} - P_n = (p - q)(P_n - P_{n-1}) = (p - 1)(p - q)^{n-1}$$

Ans. $P_n = \dfrac{q + (1 - p)(p - q)^{n-1}}{1 - p + q}$

9.6. Continuous Probability and Distribution Functions. In previous
discussions the probability event space consisted of a finite number of
elementary events. A simple example illustrates the extension of this
idea. We consider an idealized spinner whose pointer can assume any
one of the directions $0 \leq x < 2\pi$, x measured in radians. We say that
there is zero probability that the direction of the pointer be less than zero
or greater than $2\pi$ since we are concerned only with angles between zero
and $2\pi$ radians. The direction of the pointer has been put into a
correspondence with the real-number system. The set of all possible
directions is called a one-dimensional random, or stochastic, variable, usually
denoted by $\xi$. In this example it makes very little sense to ask the
following question: What is the probability that a random spin yields
the direction $\theta_0$? $\theta_0$ is just one possible event of an infinite number of
different events which may occur. It does make sense to ask the following
question: What is the probability that the random event $\xi$ be less than
or equal to x? We define the a priori distribution function F(x) as

$$\begin{aligned}
F(x) &= 0 && \text{for } x < 0 \\
F(x) &= P(\xi \leq x) = \frac{x}{2\pi} && \text{for } 0 \leq x \leq 2\pi \\
F(x) &= 1 && \text{for } x \geq 2\pi
\end{aligned} \tag{9.17}$$

Figure 9.3 shows a graphic representation of F(x).
For any x and y we note that (9.17) yields

$$P(x \leq \xi \leq y) = F(y) - F(x)$$

For $y = x + \Delta x$ we have $P(x \leq \xi \leq x + \Delta x) = F(x + \Delta x) - F(x) =
\Delta F$, the probability that $\xi$ lies between x and $x + \Delta x$. We also note
that F(x) is a monotonic nondecreasing function of x with $F(-\infty) = 0$,
$F(+\infty) = 1$.

Fig. 9.3


For the more general case we consider a nonnegative function p(x)
such that p(x) dx represents the probability that the random event $\xi$ lie
between x and x + dx, except for infinitesimals of higher order. We
further desire

$$\int_{-\infty}^{\infty} p(x)\,dx = 1 \tag{9.18}$$

p(x) is called the probability function of the random variable $\xi$. The
distribution function is simply

$$F(x) = P(\xi \leq x) = \int_{-\infty}^{x} p(x)\,dx \tag{9.19}$$

Note that F(x) is nondecreasing since p(x) is nonnegative, with
$F(-\infty) = 0$, $F(\infty) = 1$.
dF(x) = p(x) dx is the probability that $\xi$ lies between x and x + dx.
A simple example of a probability function is $p(x) = \dfrac{1}{\pi(1 + x^2)}$ with

$$F(x) = \frac{1}{\pi}\int_{-\infty}^{x} \frac{dx}{1 + x^2} = \frac{1}{\pi}\left(\tan^{-1} x + \frac{\pi}{2}\right)$$

The nth moment of a distribution is defined by

$$\alpha_n = \int_{-\infty}^{\infty} x^n p(x)\,dx \qquad n = 0, 1, 2, \ldots \tag{9.20}$$

provided the integrals exist. For $p(x) = \dfrac{1}{\pi(1 + x^2)}$ only $\alpha_0$ exists. From
dF(x) = p(x) dx we can write

$$\alpha_n = \int_{-\infty}^{\infty} x^n\,dF(x) \qquad n = 0, 1, 2, \ldots \tag{9.21}$$


Equation (9.21) is the Stieltjes integral of Chap. 7. The discrete case
can be handled by use of this integral. Let x = 1 represent the occurrence
of a head and x = 2 the occurrence of a tail when a coin is tossed
with p(1) = 1/2, p(2) = 1/2, p(x) = 0 otherwise. Then F(x) = 0 for x < 1,
F(x) = 1/2 for $1 \leq x < 2$, F(x) = 1 for $x \geq 2$. We can consider that two
point masses contribute to the probability function. From dF(x) = 0
for x < 1, dF = 1/2 at x = 1, dF = 0 for 1 < x < 2, dF = 1/2 at x = 2,
dF = 0 for x > 2, we note that

$$F(x) = \int_{-\infty}^{x} dF = \begin{cases} 0 & \text{for } x < 1 \\ \tfrac12 & \text{for } x = 1 \\ \tfrac12 & \text{for } 1 < x < 2 \\ 1 & \text{for } x \geq 2 \end{cases}$$

Geometrically (see Fig. 9.4), we note that F(x) is monotonic nondecreasing.
Discontinuities in F(x) occur at x = 1, x = 2, because point masses
are situated there.

Fig. 9.4

To compute $\alpha_1$, we note that

$$\alpha_1 = \int_{-\infty}^{\infty} x\,dF(x) = 1 \cdot \tfrac12 + 2 \cdot \tfrac12 = \tfrac32$$

Had we chosen $x_1 = -1$, $x_2 = 1$, we would have obtained

$$\alpha_1 = (-1)\tfrac12 + (1)\tfrac12 = 0$$

Note that 3/2 is the mean of x = 1 and x = 2.


Example 9.13. The Gaussian Distribution. Let us assume that a dart is thrown at
the xy plane under the following assumptions:

I. If p(x, y) dy dx is the probability that the dart fall in the area bounded by x and
x + dx, y and y + dy, then p(x, y) depends only on the distance of (x, y) from the
origin and is independent of $\theta = \tan^{-1}(y/x)$. Thus

$$p(x, y)\,dy\,dx = q(r^2)\,r\,dr\,d\theta = q(x^2 + y^2)\,dy\,dx \qquad p(x, y) = q(x^2 + y^2)$$

We assume further that p(x, y) is differentiable.

II. $p(x, y) \to 0$ as x or y becomes infinite.

III. $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x, y)\,dy\,dx = 1$.

We attempt to determine p(x, y). First we note that

$$\left[\int_{-\infty}^{\infty} p(x, y)\,dy\right] dx \left[\int_{-\infty}^{\infty} p(x, y)\,dx\right] dy = p(x, y)\,dy\,dx$$

since

$$P_1(x \leq \xi \leq x + dx,\ -\infty < y < \infty) = \left[\int_{-\infty}^{\infty} p(x, y)\,dy\right] dx$$

$$P_2(-\infty < x < \infty,\ y \leq \eta \leq y + dy) = \left[\int_{-\infty}^{\infty} p(x, y)\,dx\right] dy$$

and

$$P(x \leq \xi \leq x + dx,\ y \leq \eta \leq y + dy) = P_1 P_2 = p(x, y)\,dy\,dx$$

Define $R(x) = \int_{-\infty}^{\infty} p(x, y)\,dy$, $S(y) = \int_{-\infty}^{\infty} p(x, y)\,dx$, so that

$$R(x)S(y) = q(x^2 + y^2) \tag{9.22}$$

Differentiating (9.22) yields

$$R(x)\frac{dS}{dy} = 2y\,q'(x^2 + y^2) \qquad S(y)\frac{dR}{dx} = 2x\,q'(x^2 + y^2)$$

and

$$\frac{1}{2yS(y)}\frac{dS}{dy} = \frac{1}{2xR(x)}\frac{dR}{dx} = \text{constant} = a \tag{9.23}$$

From (9.23), $R(x) = Ae^{ax^2}$, $S(y) = Be^{ay^2}$, and

$$p(x, y) = q(x^2 + y^2) = Ce^{a(x^2 + y^2)} \tag{9.24}$$

Condition II implies that a is negative. If we choose $a = -1/2\sigma^2$, of necessity,
$C = 1/2\pi\sigma^2$ from condition III. Thus

$$p(x, y) = \frac{1}{2\pi\sigma^2}\,e^{-(x^2 + y^2)/2\sigma^2} \tag{9.25}$$

Equation (9.25) is the normal distribution of Gauss. For the one-dimensional case

$$\varphi(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(1/2\sigma^2)(x - m)^2} \tag{9.26}$$

where m represents a displacement from the origin. Let the reader show that

$$\sigma^2 = \int_{-\infty}^{\infty} (x - m)^2 \varphi(x)\,dx$$

Thus $\sigma^2$ is the second moment of $\varphi(x)$ relative to the center m.
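
The normalization of (9.26) and the statement about its second moment can be checked numerically. The sketch below uses a plain midpoint rule over a finite interval; the values m = 1 and sigma = 2, the cutoff at 12 sigma, and the helper names are all assumptions made for illustration, and the check is numerical evidence rather than a proof.

```python
from math import exp, pi, sqrt

def phi(x, m, sigma):
    """One-dimensional Gaussian probability function (9.26)."""
    return exp(-(x - m) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

m, sigma = 1.0, 2.0
lo, hi = m - 12 * sigma, m + 12 * sigma   # tails beyond this are negligible

area = integrate(lambda x: phi(x, m, sigma), lo, hi)
variance = integrate(lambda x: (x - m) ** 2 * phi(x, m, sigma), lo, hi)
```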
Example 9.14. Tchebyscheff's Theorem. Let $\xi$ be a random variable with a
probability function p(x). Let $g(\xi)$ be a nonnegative function of $\xi$, and let S be the set of
points such that $g(\xi) \geq K > 0$. $\bar{S}$ will denote the rest of the real axis. The expected
or mean value of $g(\xi)$ is defined as

$$E[g(\xi)] = \int_{-\infty}^{\infty} g(x)p(x)\,dx$$

The probability that $g(\xi)$ be greater than or equal to K when $\xi$ is chosen at random is
simply the probability that $\xi$ be found in S, so that $P[g(\xi) \geq K] = \int_S p(x)\,dx$. Now

$$E[g(\xi)] = \int_S g(x)p(x)\,dx + \int_{\bar{S}} g(x)p(x)\,dx \geq \int_S g(x)p(x)\,dx \geq K\int_S p(x)\,dx$$

since g(x) and p(x) are nonnegative. We obtain Tchebyscheff's theorem,

$$P[g(\xi) \geq K] \leq \frac{E[g(\xi)]}{K} \tag{9.27}$$

What difficulties arise if g(x) = 1 for x irrational, g(x) = 0 for x rational, K = 1/2?
If m is the first moment or mean of $\xi$ and $\sigma^2$ is the variance of $\xi$ defined by

$$\sigma^2 = \int_{-\infty}^{\infty} (x - m)^2 p(x)\,dx$$

let the reader deduce the Bienaymé-Tchebyscheff inequality,

$$P(|\xi - m| \geq k\sigma) \leq \frac{1}{k^2} \tag{9.28}$$

if we choose $g(\xi) = (\xi - m)^2$, $K = k^2\sigma^2$.

Example 9.15. The Buffon Needle Problem. A board is ruled by equidistant
parallel lines, two consecutive lines being d units apart. A needle of length a < d is
thrown at random on the board. What is the probability that the needle will intersect
one of the lines? Let x and y describe the position of the needle (see Fig. 9.5).

Fig. 9.5

If $y \leq a \sin x$, our situation is favorable, while if $d > y > a \sin x$, an unfavorable
case occurs. We assume that x and y are independent random variables. Thus

$$p_1(x \leq \xi \leq x + dx) = \frac{dx}{\pi} \qquad p_2(y \leq \eta \leq y + dy) = \frac{dy}{d}$$

and $p(x, y)\,dy\,dx = (1/\pi d)\,dy\,dx$. We are interested in $P(0 \leq x \leq \pi,\ 0 \leq y \leq
a \sin x)$, which is defined by

$$\int_0^{\pi}\int_0^{a \sin x} p(x, y)\,dy\,dx = \frac{1}{\pi d}\int_0^{\pi}\int_0^{a \sin x} dy\,dx = \frac{2a}{\pi d}$$
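
The result $2a/\pi d$ can be illustrated by simulation. The sketch below follows the example's model directly: the angle x is taken uniform on (0, π), the distance variable y uniform on (0, d), and a hit is counted when $y \leq a \sin x$. The function name, the seed, and the trial count are assumptions; a Monte Carlo estimate only approximates the exact value.

```python
import random
from math import pi, sin

def buffon_estimate(a, d, trials, seed=0):
    """Monte Carlo estimate of the probability that the needle crosses a line."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0.0, pi)   # angle of the needle
        y = rng.uniform(0.0, d)    # distance variable, uniform on (0, d)
        if y <= a * sin(x):        # favorable case of Example 9.15
            hits += 1
    return hits / trials

a, d = 1.0, 2.0
estimate = buffon_estimate(a, d, 200_000)
exact = 2 * a / (pi * d)
```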


Example 9.16. Let $\xi$ be a random variable with a probability function p(x). The
probability that $\xi$ lies between $x_1$ and $x_1 + dx_1$ as the result of a single experiment is
$p(x_1)\,dx_1$. If we repeat the experiment, the probability that $\xi$ lies between $x_2$ and
$x_2 + dx_2$ is $p(x_2)\,dx_2$. The joint probability that both results happen as the result of
two experiments (assumed independent) is

$$p(x_1)p(x_2)\,dx_1\,dx_2 \tag{9.29}$$

Equation (9.29) is a special case of the single random variable $(\xi, \eta)$ with probability
function p(x, y). We note that

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x_1)p(x_2)\,dx_1\,dx_2 = 1$$

We now ask the following question: What is the probability that, as the result of two
experiments, $x_1 + x_2 \geq K$, K a constant? In Fig. 9.6 we note that this is just the
probability that the point $(x_1, x_2)$ lie above the line $x_1 + x_2 = K$, so that

Fig. 9.6 (the line $x_2 = K - x_1$ in the $x_1 x_2$ plane)

$$P(x_1 + x_2 \geq K) = \int_{-\infty}^{\infty}\int_{K - x_1}^{\infty} p(x_1)p(x_2)\,dx_2\,dx_1 \tag{9.30}$$

From (9.30) we have

$$P(x_1 + x_2 \geq K + dK) = \int_{-\infty}^{\infty}\int_{K + dK - x_1}^{\infty} p(x_1)p(x_2)\,dx_2\,dx_1 \tag{9.31}$$

so that

$$P(K \leq x_1 + x_2 \leq K + dK) = \int_{-\infty}^{\infty}\int_{K - x_1}^{K + dK - x_1} p(x_1)p(x_2)\,dx_2\,dx_1 \tag{9.32}$$

by subtracting (9.31) from (9.30). Since $\int_{K - x_1}^{K + dK - x_1} p(x_2)\,dx_2 = p(K - x_1)\,dK$, we
note that the probability function for $x_1 + x_2 = u$ is

$$p(u) = \int_{-\infty}^{\infty} p(x_1)p(u - x_1)\,dx_1 \tag{9.33}$$
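
Formula (9.33) is a convolution. Its discrete analogue, with the integral replaced by a sum over point masses, is easy to sketch; the example of two dice and the name `convolve` are assumptions chosen for illustration and are not taken from the text.

```python
from fractions import Fraction

def convolve(p1, p2):
    """Discrete analogue of (9.33): probabilities for the sum of two
    independent variables given as {value: probability} dictionaries."""
    out = {}
    for x1, w1 in p1.items():
        for x2, w2 in p2.items():
            out[x1 + x2] = out.get(x1 + x2, 0) + w1 * w2
    return out

die = {k: Fraction(1, 6) for k in range(1, 7)}
two_dice = convolve(die, die)   # distribution of the total of two dice
```

The same function applied repeatedly gives the distribution of a sum of several independent results.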


Problems

1. Show that $\varphi(x)$ of (9.26) satisfies $\int_{-\infty}^{\infty} \varphi(x)\,dx = 1$.

2. Derive (9.28).

3. Find the four-dimensional joint probability function for the two-needle Buffon
problem.

4. Show that $\sigma^2 = \int_{-\infty}^{\infty} (x - m)^2 p(x)\,dx = \alpha_2 - m^2$, $\alpha_2$ defined by (9.20).

5. Let $\xi$ and $\eta$ be random variables with Gaussian probability functions (m = 0).
Show that $u = \xi + \eta$ has a Gaussian distribution.

6. Find $\alpha_1$, $\alpha_2$, $\alpha_3$ for $\varphi(x)$ of (9.26). See (9.20) for the definition of $\alpha_i$, $i = 1, 2, 3, \ldots$

7. If $\xi$ is a random variable with a Gaussian distribution (m = 0), find the
probability function for the random variable $\xi^2$. Hint: Let $u = \xi^2$ so that

$$P(u \geq t) = P(\xi^2 \geq t) = P(|\xi| \geq \sqrt{t}) \qquad t \geq 0$$

$$= \frac{1}{\sigma\sqrt{2\pi}}\int_{\sqrt{t}}^{\infty} e^{-x^2/2\sigma^2}\,dx + \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{-\sqrt{t}} e^{-x^2/2\sigma^2}\,dx$$

$$P(u \geq t + dt) = P(\xi^2 \geq t + dt) = P(|\xi| \geq \sqrt{t + dt}) \approx P\left(|\xi| \geq \sqrt{t} + \frac{dt}{2\sqrt{t}}\right)$$

Then show that $P(t \leq u \leq t + dt) = (1/\sigma\sqrt{2\pi t})\,e^{-t/2\sigma^2}\,dt$, so that

$$p(t) = \frac{1}{\sigma\sqrt{2\pi t}}\,e^{-t/2\sigma^2} \quad \text{for } t > 0 \qquad p(t) = 0 \quad \text{for } t \leq 0$$

8. Let $\xi$ be a random variable with probability function p(x). Let $u = u(\xi)$ be a
strictly increasing (or decreasing) function of $\xi$. Show that the probability function
for the random variable u is $q(u) = p(x(u))\left|\dfrac{dx}{du}\right|$, where x(u) is the inverse function
of u(x). Apply this result to Prob. 7.

9. Consider a distribution of points in the plane with a density function given by
(9.25). At t = 0 each point moves in a direction $\phi$ with speed V and with a
probability function $p(\phi) = 1/2\pi$, $0 \leq \phi < 2\pi$. Show that at time t the probability
function in space and direction of motion is given by

$$p(r, \theta, \phi, t) = \frac{1}{4\pi^2\sigma^2}\,e^{-(1/2\sigma^2)[r^2 + V^2t^2 - 2rVt\cos(\theta - \phi)]}$$

Show that $p(r, \theta, \phi, t)$ satisfies $\dfrac{\partial p}{\partial t} + \nabla \cdot (p\mathbf{v}) = 0$ with

$$\mathbf{v} = V\cos\phi\,\mathbf{i} + V\sin\phi\,\mathbf{j} = V\cos(\phi - \theta)\,\mathbf{e}_r + V\sin(\phi - \theta)\,\mathbf{e}_\theta$$


9.7. The Characteristic Function. Bernoulli's Theorem. The
Central-limit Theorem. Let $\xi$ be a random variable with a probability
function p(x). Let us assume that the Fourier integral of p(x) exists.
Then

$$\varphi(t) = \int_{-\infty}^{\infty} e^{itx} p(x)\,dx \tag{9.34}$$

is called the characteristic function of p(x), with t real. We note that
$\varphi(0) = \int_{-\infty}^{\infty} p(x)\,dx = 1$. For t real, n a positive integer, we have

$$\left|\int_{-\infty}^{\infty} x^{2n} e^{itx} p(x)\,dx\right| \leq \int_{-\infty}^{\infty} x^{2n} p(x)\,dx = \alpha_{2n}$$

If we can differentiate inside the integral, then

$$\left.\frac{d^n\varphi}{dt^n}\right|_{t=0} = i^n \int_{-\infty}^{\infty} x^n p(x)\,dx = i^n\alpha_n \tag{9.35}$$

Hence, if we can find $\varphi(t)$, it will be possible to find the nth moment
from (9.35).


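Example 9.17. Let $p(x) = e^{-x}$ for $x \geq 0$, p(x) = 0 otherwise. Then

$$\varphi(t) = \int_0^{\infty} e^{itx} e^{-x}\,dx = \frac{1}{1 - it}$$

$$\left.\frac{d\varphi}{dt}\right|_{t=0} = \left.\frac{i}{(1 - it)^2}\right|_{t=0} = i \qquad \left.\frac{d^2\varphi}{dt^2}\right|_{t=0} = \left.\frac{-2}{(1 - it)^3}\right|_{t=0} = -2$$

so that $\alpha_1 = 1$, $\alpha_2 = 2$. Since

$$\varphi(t) = \frac{1}{1 - it} = \sum_{n=0}^{\infty} i^n t^n = \sum_{n=0}^{\infty} \varphi^{(n)}(0)\frac{t^n}{n!} \qquad \text{for } |t| < 1$$

we have $\varphi^{(n)}(0) = i^n n!$ and $\alpha_n = n!$. This result is also obvious since

$$\Gamma(n + 1) = \int_0^{\infty} x^n e^{-x}\,dx = n!$$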
Example 9.18. Bernoulli's Theorem. For a Bernoulli sequence of n events we have

$$P_r = \binom{n}{r} p^r (1 - p)^{n-r} \tag{9.36}$$

for the probability of obtaining exactly r successes [see (9.14)].
The analogue of (9.34) for the discrete case is

$$\varphi(t) = \sum_{r=0}^{n} e^{itr} P_r = \int_{-\infty}^{\infty} e^{itx}\,dF(x) \tag{9.37}$$

Applying (9.36) yields

$$\varphi(t) = \sum_{r=0}^{n} \binom{n}{r} (pe^{it})^r (1 - p)^{n-r} = [pe^{it} + (1 - p)]^n$$

Thus $\varphi(0) = 1$, $\varphi'(0) = ipn$, $\varphi''(0) = -n(n - 1)p^2 - np$. Hence $\alpha_1 = m = pn$,
$\alpha_2 = n(n - 1)p^2 + np$, so that $\sigma^2 = \alpha_2 - m^2 = np(1 - p)$ (see Prob. 4, Sec. 9.6).
Applying (9.28) with $k = \epsilon\sqrt{n/p(1 - p)}$, $\epsilon > 0$, yields

$$P(|\xi - np| \geq \epsilon n) \leq \frac{p(1 - p)}{\epsilon^2 n} \leq \frac{1}{4\epsilon^2 n} \tag{9.38}$$

since $p(1 - p) \leq 1/4$ for $0 \leq p \leq 1$. Formula (9.38) is equivalent to

$$P\left(\left|\frac{\xi}{n} - p\right| \geq \epsilon\right) \leq \frac{1}{4\epsilon^2 n} \tag{9.39}$$

Formula (9.39) is a form of Bernoulli's theorem. In other words, the probability that
the frequency of occurrence, $\xi/n$, differs from its mean value p by a quantity of
absolute value at least equal to $\epsilon$ tends to zero as $n \to \infty$ however small $\epsilon$ is chosen.
This essentially means that the greater n is the more certain we are that $\xi/n$ differs
little from p, where $\xi$ is the total number of successes in n trials.
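
The bound in Bernoulli's theorem can be compared with the exact binomial probability for specific cases. The sketch below assumes p = 1/2 and a tolerance of 1/10, values chosen only for illustration; `tail` and `results` are hypothetical names.

```python
from fractions import Fraction
from math import comb

def tail(n, p, eps):
    """Exact P(|r/n - p| >= eps) for r successes in n Bernoulli trials."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r)
               for r in range(n + 1)
               if abs(Fraction(r, n) - p) >= eps)

p, eps = Fraction(1, 2), Fraction(1, 10)

# For each n: (n, exact tail probability, bound 1/(4 eps^2 n)).
results = [(n, tail(n, p, eps), 1 / (4 * eps**2 * n)) for n in (50, 100, 200)]
```

In these cases the exact probability lies well below the bound, which is to be expected since the Bienaymé-Tchebyscheff inequality uses only the variance of the distribution.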


Let us return to the characteristic function. From (9.34) we note that,
if $\varphi(t)$ is known, then

$$p(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\varphi(t)\,dt \tag{9.40}$$

Hence p(x) is uniquely determined if the characteristic function is known.
Equation (9.40) follows from the Fourier integral theorem. If $\xi$ and $\eta$
are independent stochastic variables with probability functions $p_1(x)$,
$p_2(y)$, respectively, it follows from the method of Example 9.16 that

$$p(u) = \int_{-\infty}^{\infty} p_1(x)p_2(u - x)\,dx \tag{9.41}$$

is the probability function for the stochastic variable $\xi + \eta$. The
characteristic function of p(u) is

$$\varphi(t) = \int_{-\infty}^{\infty} e^{itu} p(u)\,du = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{itu} p_1(x)p_2(u - x)\,dx\,du \tag{9.42}$$

Now

$$\varphi_1(t) = \int_{-\infty}^{\infty} e^{itx} p_1(x)\,dx \qquad \varphi_2(t) = \int_{-\infty}^{\infty} e^{ity} p_2(y)\,dy$$

so that

$$\varphi_1(t)\varphi_2(t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{it(x + y)} p_1(x)p_2(y)\,dy\,dx = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{itu} p_1(x)p_2(u - x)\,du\,dx \tag{9.43}$$

by letting u = x + y. A comparison of (9.42) with (9.43) yields

$$\varphi(t) = \varphi_1(t)\varphi_2(t) \tag{9.44}$$

provided the order of integration can be interchanged.
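
The product rule (9.44) can be illustrated in the discrete setting, where the characteristic function is a finite sum as in (9.37). The sketch below is an assumption-laden illustration: it uses a coin scored as -1 or +1 and a fair die, builds the distribution of their sum by enumeration, and compares both sides of (9.44) at one arbitrary value of t.

```python
import cmath

def char_fn(pmf, t):
    """Discrete characteristic function: sum of e^{itx} weighted by probability."""
    return sum(w * cmath.exp(1j * t * x) for x, w in pmf.items())

coin = {-1: 0.5, 1: 0.5}                 # tail scored -1, head scored +1
die = {k: 1 / 6 for k in range(1, 7)}    # fair die

# Distribution of the sum of the two independent results.
total = {}
for x, wx in coin.items():
    for y, wy in die.items():
        total[x + y] = total.get(x + y, 0.0) + wx * wy

t = 0.7                                   # arbitrary test value
lhs = char_fn(total, t)                   # characteristic function of the sum
rhs = char_fn(coin, t) * char_fn(die, t)  # product of the two factors
```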

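Another interesting result is the following: If $\varphi_1(t), \varphi_2(t), \ldots,
\varphi_n(t), \ldots$ is a sequence of characteristic functions which converges to
$\varphi(t)$, obtained from a sequence of probability functions $p_1(x), p_2(x), \ldots,
p_n(x), \ldots$, it may be possible that

$$p(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\varphi(t)\,dt \tag{9.45}$$

where $\varphi(t)$ and p(x) are limits of their respective sequences. In order
that (9.45) hold, we must have

$$p(x) = \lim_{n \to \infty} p_n(x) = \lim_{n \to \infty} \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\varphi_n(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\lim_{n \to \infty}\varphi_n(t)\,dt \tag{9.45'}$$

The reader is referred to Sec. 10.22 for a discussion of this type of problem.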

Let us consider now the following example, which will illustrate one
aspect of the central-limit theorem involving probabilities. Let $\xi$ be a
random variable, and, to be momentarily specific, we shall consider the
toss of a coin. The random variable $\xi$ will have the value $x = 1$ if a
head occurs and the value $x = -1$ if a tail occurs. If we toss the coin
five times, we can record the numbers $(x_1, x_2, x_3, x_4, x_5)$ with $x_i$ equal to
1 or $-1$, $i = 1, 2, 3, 4, 5$. The set $(x_1, x_2, x_3, x_4, x_5)$ represents a point
in a five-dimensional space. We can repeat the experiment of tossing
five coins as often as we please and obtain a set of 5-tuples,
$(x_1^\alpha, x_2^\alpha, x_3^\alpha, x_4^\alpha, x_5^\alpha)$, $\alpha = 1, 2, \ldots$. We may look upon the aggregate
$x_1^1, x_1^2, x_1^3, \ldots$ as a set of results yielding information about the random
variable $\xi$. The same statement can be made concerning the set
$x_2^1, x_2^2, x_2^3, \ldots, x_2^\alpha, \ldots$, etc. The set of 5-tuples can also be looked upon
as defining a new random variable. For the more general case we let $p(x)$ be
the probability function for the random variable $\xi$, and we consider the
$n$-tuple of independent events $(x_1, x_2, \ldots, x_n)$. The probability that a
point lie in the volume bounded by $x_1$ and $x_1 + dx_1$, $x_2$ and $x_2 + dx_2$, . . . ,
$x_n$ and $x_n + dx_n$ is given by

$$p(x_1)p(x_2)\cdots p(x_n)\,dx_1\,dx_2\cdots dx_n$$

If $S_n$ is the random variable $\sum_{i=1}^{n} x_i$, an extension of Example 9.16 shows
that the probability function for $S_n$ is given by

$$P_n(u) = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} p(x_1)p(x_2)\cdots p(x_{n-1})\,
p\Bigl(u - \sum_{i=1}^{n-1} x_i\Bigr)\,dx_1\cdots dx_{n-1} \tag{9.46}$$


The characteristic function is

$$\Phi_n(t) = \int_{-\infty}^{\infty} P_n(u)e^{itu}\,du
= \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} e^{it(x_1+x_2+\cdots+x_n)}p(x_1)\cdots p(x_n)\,dx_1\cdots dx_n \tag{9.47}$$

by letting $u = x_1 + x_2 + \cdots + x_n$, $du = dx_n$, and reversing the order
of integration. Equation (9.47) becomes

$$\Phi_n(t) = [\varphi(t)]^n \tag{9.48}$$

with $\varphi(t) = \int_{-\infty}^{\infty} p(x)e^{itx}\,dx$. Let us assume that the first moment of
$p(x)$ is zero so that $\sigma^2$ is its second moment. Thus $\varphi(0) = 1$, $\varphi'(0) = 0$,
$\varphi''(0) = -\sigma^2$. If $\varphi(t)$ has a continuous third derivative in a
neighborhood of $t = 0$, then

$$\varphi(t) = 1 - \frac{\sigma^2 t^2}{2} + \alpha(t)$$

with $|\alpha(t)| < At^3$ (see Sec. 10.23).




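If $T_n$ is the random variable $S_n/\sigma\sqrt{n}$, let the reader show that the
characteristic function for $T_n$ is $\Psi_n(t) = \Phi_n(t/\sigma\sqrt{n})$. Hence

$$\Psi_n(t) = \Bigl[\varphi\Bigl(\frac{t}{\sigma\sqrt{n}}\Bigr)\Bigr]^n
= \Bigl[1 - \frac{t^2}{2n} + \alpha\Bigl(\frac{t}{\sigma\sqrt{n}}\Bigr)\Bigr]^n$$

with $|\alpha(t/\sigma\sqrt{n})| < At^3/\sigma^3 n^{3/2}$. From $\ln(1 + z) = z + \beta(z)$ with
$|\beta(z)| < Bz^2$ for $z$ small, one has that

$$\lim_{n\to\infty}\ln\Psi_n(t) = \lim_{n\to\infty} n\ln\Bigl[1 - \frac{t^2}{2n} + \alpha\Bigl(\frac{t}{\sigma\sqrt{n}}\Bigr)\Bigr] = -\frac{t^2}{2} \tag{9.49}$$

From (9.49) it follows that

$$\lim_{n\to\infty}\Psi_n(t) = e^{-t^2/2}$$

The probability function $P(x)$ associated with the characteristic function
$e^{-t^2/2}$ is

$$P(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-t^2/2}e^{-itx}\,dt = \frac{1}{\sqrt{2\pi}}e^{-x^2/2} \tag{9.50}$$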
If we accept (9.45'), we have shown that if $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ are
random variables with the same probability function $p(x)$, with a mean
equal to zero and second moment equal to $\sigma^2$, then
$T_n = \sum_{i=1}^{n}\xi_i/\sigma\sqrt{n}$ is a random variable whose probability function
approaches the normal distribution $(1/\sqrt{2\pi})e^{-x^2/2}$ as $n$ becomes infinite.
We say that the sequence of random variables obeys the central-limit law.
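
For the coin of the example above (values $+1$ and $-1$, each with probability $\tfrac12$) one has $\varphi(t) = \cos t$ and $\sigma = 1$, so that by (9.48) the characteristic function of $T_n$ is $[\cos(t/\sqrt{n})]^n$. The following sketch, with hypothetical function names, simply tabulates this quantity against the limit $e^{-t^2/2}$; it illustrates the limit and is not a proof.

```python
import math

def psi_n(t, n):
    """Characteristic function of T_n for the +1/-1 coin: [cos(t / sqrt(n))]^n."""
    return math.cos(t / math.sqrt(n)) ** n

def psi_limit(t):
    """Limiting characteristic function exp(-t^2 / 2)."""
    return math.exp(-t * t / 2.0)

t = 1.5
for n in (1, 10, 100, 10000):
    print(n, psi_n(t, n), psi_limit(t))
```

The difference between the two columns shrinks as $n$ grows, in agreement with (9.49).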


Problems 


1. Derive (9.49).
2. Derive (9.50).
3. Show that the characteristic function for Cauchy's distribution function,
$p(x) = \dfrac{1}{\pi}\,\dfrac{1}{1 + x^2}$, is $\varphi(t) = e^{-|t|}$.

4. Let $p(x) = 1$ for $-\tfrac12 \le x \le \tfrac12$, $p(x) = 0$ otherwise. Show that
$\varphi(t) = \dfrac{2}{t}\sin\dfrac{t}{2}$.

5. If $\xi$ is a random variable with characteristic function $\varphi(t)$, show that $e^{-iat}\varphi(t)$ is
the characteristic function for the random variable $\xi - a$, $a$ = constant.

6. Consider a sequence of 1,000 Bernoulli trials with p = 3. Show that the proba- 
bility that the experimental ratio r/n will differ from 3 by less than 0.01 is greater than 
or equal to 4. 




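9.8. The $\chi^2$ Distribution. Application to Statistics. From the
definition of the gamma function (see Sec. 4.16) we have

$$\Gamma(z) = \int_0^{\infty} e^{-x}x^{z-1}\,dx = a^z\int_0^{\infty} e^{-ay}y^{z-1}\,dy \qquad a > 0$$

so that
$$\int_0^{\infty}\frac{a^z}{\Gamma(z)}e^{-ay}y^{z-1}\,dy = 1 \qquad a > 0 \tag{9.51}$$

The function
$$f(y; a, z) = \begin{cases}\dfrac{a^z}{\Gamma(z)}e^{-ay}y^{z-1} & \text{for } y > 0\\[4pt] 0 & \text{for } y \le 0\end{cases} \tag{9.52}$$

is nonnegative for $a > 0$, $z > 0$, and $\int_{-\infty}^{\infty} f(y; a, z)\,dy = 1$. We can look
upon $f(y; a, z)$ as a probability function. Its characteristic function is

$$\varphi(t) = \int_{-\infty}^{\infty} e^{ity}f(y; a, z)\,dy
= \frac{a^z}{\Gamma(z)}\int_0^{\infty} e^{-(a-it)y}y^{z-1}\,dy
= \frac{a^z}{\Gamma(z)(a - it)^z}\int_0^{\infty} e^{-v}v^{z-1}\,dv
= \frac{a^z}{(a - it)^z} = \Bigl(1 - \frac{it}{a}\Bigr)^{-z} \tag{9.53}$$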


Now let $\xi$ be a random variable with probability function

$$p(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}$$

The probability function for $\xi^2$ is $P(x) = (1/\sqrt{2\pi x})\,e^{-x/2}$ for $x > 0$,
$P(x) = 0$ otherwise (see Prob. 7, Sec. 9.6). If $\xi_1, \xi_2, \ldots, \xi_n$ are $n$
independent random variables with the same probability function $p(x)$ given
above, the characteristic function for the random variable

$$\chi^2 = \sum_{i=1}^{n}\xi_i^2 \tag{9.54}$$

is the product of the characteristic functions for each random variable $\xi_i^2$,
$i = 1, 2, \ldots, n$ [see (9.44)]. Now the characteristic function for $P(x)$ is

$$\varphi(t) = \int_0^{\infty}\frac{e^{itx}e^{-x/2}}{\sqrt{2\pi x}}\,dx
= \int_0^{\infty}\frac{e^{-(1/2 - it)x}}{\sqrt{2\pi x}}\,dx
= \frac{1}{\sqrt{2\pi}}\Bigl(\frac12 - it\Bigr)^{-1/2}\int_0^{\infty} v^{-1/2}e^{-v}\,dv
= (1 - 2it)^{-1/2}$$

The characteristic function for $\chi^2$ is

$$\Phi(t) = (1 - 2it)^{-n/2} \tag{9.55}$$

Comparing (9.55) with (9.53), we note that $z = n/2$, $a = \tfrac12$, so that the
probability function associated with the random variable $\chi^2$ is $f(y; \tfrac12, n/2)$,

$$k_n(y) = \begin{cases}\dfrac{1}{2^{n/2}\Gamma(n/2)}\,y^{n/2-1}e^{-y/2} & \text{for } y > 0\\[4pt] 0 & \text{for } y \le 0\end{cases} \tag{9.56}$$

The distribution function for $k_n(y)$ is

$$K_n(x) = P(\chi^2 \le x) = \frac{1}{2^{n/2}\Gamma(n/2)}\int_0^{x} y^{n/2-1}e^{-y/2}\,dy \tag{9.57}$$

The distribution defined by $K_n(x)$ is called the $\chi^2$ distribution, principally
associated with K. Pearson.
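
In place of a printed table, $K_n(x)$ of (9.57) can be evaluated directly. The sketch below is one possible approach and rests on an identity not used in the text: with $s = n/2$ and $h = x/2$, the substitution $y = 2v$ turns (9.57) into $\frac{1}{\Gamma(s)}\int_0^h v^{s-1}e^{-v}\,dv$, which has the power series $h^s e^{-h}\sum_{k \ge 0} h^k/\Gamma(s + k + 1)$. The function name `chi2_cdf` and the fixed number of terms are assumptions made for illustration.

```python
import math

def chi2_cdf(x, n, terms=200):
    """K_n(x) of (9.57), summed from the power series of the incomplete gamma integral."""
    if x <= 0:
        return 0.0
    s, h = n / 2.0, x / 2.0
    term = 1.0 / math.gamma(s + 1.0)   # k = 0 term: 1 / Gamma(s + 1)
    total = term
    for k in range(1, terms):
        term *= h / (s + k)            # now equals h^k / Gamma(s + k + 1)
        total += term
    return (h ** s) * math.exp(-h) * total

print(chi2_cdf(3.841, 1))              # close to 0.95
print(chi2_cdf(2.0 * math.log(2.0), 2))  # K_2(x) = 1 - exp(-x/2), so this is close to 0.5
```

For $n = 2$ the series collapses to $1 - e^{-x/2}$, which gives an independent check on the routine.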

The $\chi^2$ test of significance arises in the following manner: Let $\xi$ be a
random variable with a known probability function $p(x)$. Let us divide
the interval $-\infty < x < \infty$ into $m$ parts, say, $-\infty < x \le x_1$, $x_1 \le x \le x_2$,
. . . , $x_{m-1} \le x < \infty$. We have

$$P_i = P(x_{i-1} \le \xi \le x_i) = \int_{x_{i-1}}^{x_i} p(x)\,dx$$

with $\sum_{i=1}^{m} P_i = 1$. Let $N$ be the number of times we sample $\xi$, the result
of any sample being independent of the previous samples. The expected
number of samples which fall in the $i$th interval is given by $NP_i$. If $r_i$ is
the actual number of samples falling in the $i$th interval, then

$$\sum_{i=1}^{m}(r_i - NP_i)^2$$

certainly constitutes some measure of discrepancy between the theoretical
and experimental results. K. Pearson found that

$$\chi^2 = \sum_{i=1}^{m}\frac{(r_i - NP_i)^2}{NP_i} \tag{9.58}$$

yielded a practical means for measuring the reliability of the experimental
results. If we let $y_i = (r_i - NP_i)/\sqrt{NP_i}$, then $\chi^2 = \sum_{i=1}^{m} y_i^2$, and

$$\sum_{i=1}^{m} y_i\sqrt{P_i} = \frac{1}{\sqrt{N}}\sum_{i=1}^{m} r_i - \sqrt{N}\sum_{i=1}^{m} P_i = \sqrt{N} - \sqrt{N} = 0$$

Hence the random variables $y_i$ are not independent. The $y_i$ lie in the
hyperplane $\sum_{i=1}^{m} y_i\sqrt{P_i} = 0$, a subspace of the $m$-dimensional Euclidean
space. It can be shown that, as $N$ becomes infinite ($N$ is the total number of
samples or trials), then the distribution of $\chi^2$ approaches $K_{m-1}(x)$ [see
(9.57)]. The reader is referred to Cramér, "Mathematical Methods of
Statistics," Chap. 30, for proof of this statement. Thus

$$\lim_{N\to\infty} P(\chi^2 \ge \chi_0^2) = \frac{1}{2^{(m-1)/2}\,\Gamma\bigl(\frac{m-1}{2}\bigr)}\int_{\chi_0^2}^{\infty} y^{(m-3)/2}e^{-y/2}\,dy \tag{9.59}$$

One can determine the value of the integral of (9.59) from a table of the
$\chi^2$ distribution. It is important to note that (9.59) is independent of
the original probability function $p(x)$.


Example 9.19. A coin was tossed 5,000 times and heads appeared 2,512 times.
Under the assumption that the coin is "true," we have $p(H) = \tfrac12$, $p(T) = \tfrac12$, $m = 2$,
$N = 5{,}000$, $P_1 = P_2 = \tfrac12$, $r_1 = 2{,}512$, $r_2 = 2{,}488$. Hence

$$\chi^2 = \frac{(2{,}512 - 2{,}500)^2}{2{,}500} + \frac{(2{,}488 - 2{,}500)^2}{2{,}500} = 0.115$$

In a table of the $\chi^2$ distribution it is found that the probability that $\chi^2$ exceed 3.841
is 0.05. Since 0.115 < 3.841, we feel that there is no inconsistency in the assumption
that the coin is true. The 5 per cent level is taken as a fairly significant level. In
the tables one also finds that $P(\chi^2 \ge 0.115)$ is about 0.73, which means that we have
a probability of about 73 per cent of obtaining a deviation from the expected result at
least as great as that actually observed. We are therefore not too worried about
the fact that experimentally we obtained $\chi^2 = 0.115$.
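
The arithmetic of Example 9.19 can be reproduced in a few lines. This is a sketch under one stated assumption: for $m - 1 = 1$ the variable $\chi^2$ is the square of a single normal variable, so that $P(\chi^2 \ge c) = \operatorname{erfc}(\sqrt{c/2})$; this shortcut does not apply when $m > 2$. The function names are hypothetical.

```python
import math

def pearson_chi2(observed, expected):
    """Pearson's sum of (r_i - N P_i)^2 / (N P_i) over all intervals."""
    return sum((r - e) ** 2 / e for r, e in zip(observed, expected))

def tail_prob_one_degree(c):
    """P(chi^2 >= c) when m - 1 = 1, using the complementary error function."""
    return math.erfc(math.sqrt(c / 2.0))

chi2 = pearson_chi2([2512, 2488], [2500.0, 2500.0])
print(chi2)                          # about 0.115
print(tail_prob_one_degree(chi2))    # about 0.73
print(tail_prob_one_degree(3.841))   # about 0.05, the 5 per cent level
```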


Problems 


1. A coin is tossed 5,000 times, with heads occurring 3,000 times. Evaluate $\chi^2$ for
this experiment. Would you be suspicious that the coin is "true"?

2. A die is rolled 6,000 times with the following occurrences: the number of times 1,
2, 3, 4, 5, 6 occurred, respectively, was 1,020; 1,032; 981; 977; 1,011; 979. Show that
$\chi^2 = 2.876$. What is your opinion as to the "truthness" of the die?

3. A random variable $\xi$ has the probability function $p(x) = (1/\sqrt{2\pi})e^{-x^2/2}$. If in
1,000 experiments one obtained 534 values of $x < 0$ and 466 values of $x \ge 0$, would
you consider that the experiment was biased?


9.9. Monte Carlo Methods and the Theory of Games. The method of 
Monte Carlo is essentially a device for making use of probability theory 




to approximate the solution of a mathematical or physical problem. A 
few examples will illustrate the method. 
Let $p(x)$ be the probability function of a random variable, $\xi$, defined
on the range $a \le x \le b$, with $\int_a^b p(x)\,dx = 1$, $p(x) = 0$ for $x$ outside the
interval $(a, b)$. The expected, or mean, value of $f(x)$ has been defined as

$$\bar{f} = \int_a^b f(x)p(x)\,dx$$

A sequence of values $x_1, x_2, \ldots, x_n$ is chosen at random subject to the
condition that the probability that the random variable $\xi$ lie between $x$
and $x + dx$ be given by $p(x)\,dx$. For $n$ large one hopes that

$$\int_a^b f(x)p(x)\,dx \approx \frac{1}{n}\sum_{i=1}^{n} f(x_i) \tag{9.60}$$

If we wish to find an approximate value of $\int_a^b F(x)\,dx$, we note that

$$\int_a^b F(x)\,dx = \int_a^b \frac{F(x)}{p(x)}\,p(x)\,dx \tag{9.61}$$

so that $f(x) = F(x)/p(x)$. A simple choice for $p(x)$ is $p(x) = 1/(b - a)$.
The choice $p(x) = 1/(b - a)$ implies that one can construct an experiment
such that all numbers on the interval $(a, b)$ have equal likelihood of
occurrence. It is extremely doubtful that such an experiment can be
devised. The toss of a single coin can be used to generate the number
zero if a head occurs, the number 1 if a tail occurs. An infinite number
of ordered tosses of a single coin, or an infinite number of ordered coins
tossed simultaneously, would yield a sequence $a_0, a_1, a_2, \ldots, a_n, \ldots$
with $a_n = 0$ or 1 for all $n \ge 0$. This, in turn, yields the number

$$x = \sum_{n=0}^{\infty}\frac{a_n}{2^{n+1}}$$

with $0 \le x \le 1$. Mathematicians have constructed tables of random
numbers to avoid the cumbersome process of obtaining a random variable
by experiment. Let the reader deduce a method for obtaining
$\xi = \sum_{n=0}^{N} a_n/2^{n+1}$ by use of a bowl, containing an equal number of red and white
balls, and a scoop with $N + 1$ ordered holes. The author chose 25 numbers
from a table of random numbers and obtained $\frac{1}{25}\sum_{i=1}^{25} x_i = 0.5030$,
reasonably close to $\int_0^1 t\,dt$.
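
The estimate (9.60) with the choice $p(x) = 1/(b - a)$ can be tried with a pseudo-random generator standing in for a table of random numbers. The sketch below makes that substitution explicitly; the function name `monte_carlo_integral`, the seed, and the sample size are illustrative assumptions, and the result changes with the seed.

```python
import random

def monte_carlo_integral(F, a, b, n, rng):
    """Estimate the integral of F over (a, b) by (9.60), (9.61) with p(x) = 1/(b - a)."""
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)        # pseudo-random stand-in for a table of random numbers
        total += F(x) * (b - a)      # f(x) = F(x) / p(x)
    return total / n

rng = random.Random(0)               # fixed seed so that the run is repeatable
estimate = monte_carlo_integral(lambda t: t, 0.0, 1.0, 10_000, rng)
print(estimate)                      # should lie near 1/2
```

With 10,000 samples the estimate is typically within a few thousandths of $\tfrac12$; with only 25 samples, as in the author's experiment, larger fluctuations are to be expected.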


We now consider a less trivial example. The game to be played concerns
three spinners (see Fig. 9.7). If a pointer comes to rest in the area

[Fig. 9.7. The three spinners I, II, III.]

designated by $p_{ij}$, we move from the $i$th spinner to the $j$th spinner. If
the pointer comes to rest in the area $x_i$, $i = 1, 2, 3$, the game is over.
The areas are to be unity so that $x_i = 1 - \sum_{j=1}^{3} p_{ij}$, $i = 1, 2, 3$, and the
$p$'s and $x$'s represent probabilities. We ask the following question: If
we begin the game at spinner I, what is the probability, $f_{11}$, that the
game will end at spinner I? We can be successful as follows:

1. $x_1$ occurs on the first spin.

2. $p_{11}$ occurs $n$ times in succession, and then $x_1$ occurs, $n = 1, 2, 3, \ldots$.

3. $p_{12}$ occurs followed by the probability, $f_{21}$, that if we are at spinner II
the game will end at spinner I.

4. $p_{11}$ occurs $n$ times in succession, $n \ge 1$, and then (3) occurs.

5. Replace $p_{12}$ and $f_{21}$ of (3) and (4) by $p_{13}$ and $f_{31}$.

Let the reader show that

$$\begin{aligned}
f_{11} &= (x_1 + p_{11}x_1 + p_{11}^2x_1 + \cdots) + (p_{12}f_{21} + p_{11}p_{12}f_{21} + p_{11}^2p_{12}f_{21} + \cdots)\\
&\quad + (p_{13}f_{31} + p_{11}p_{13}f_{31} + p_{11}^2p_{13}f_{31} + \cdots)\\
&= \frac{x_1 + p_{12}f_{21} + p_{13}f_{31}}{1 - p_{11}}
\end{aligned}$$

By the reasoning used above one can readily show that

$$\begin{aligned}
(1 - p_{ii})f_{ii} &= x_i + \sum_{k \ne i} p_{ik}f_{ki} && i = 1, 2, 3\\
(1 - p_{ii})f_{ij} &= \sum_{k \ne i} p_{ik}f_{kj} && j \ne i = 1, 2, 3
\end{aligned} \tag{9.63}$$



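From (9.63) one obtains

$$\begin{aligned}
(p_{11} - 1)f_{12} + p_{12}f_{22} + p_{13}f_{32} &= 0\\
p_{21}f_{12} + (p_{22} - 1)f_{22} + p_{23}f_{32} &= -x_2\\
p_{31}f_{12} + p_{32}f_{22} + (p_{33} - 1)f_{32} &= 0
\end{aligned}$$

If we solve for $f_{12}$, we have

$$f_{12} = \frac{\begin{vmatrix} 0 & p_{12} & p_{13}\\ -x_2 & p_{22} - 1 & p_{23}\\ 0 & p_{32} & p_{33} - 1\end{vmatrix}}
{\begin{vmatrix} p_{11} - 1 & p_{12} & p_{13}\\ p_{21} & p_{22} - 1 & p_{23}\\ p_{31} & p_{32} & p_{33} - 1\end{vmatrix}}
= \frac{x_2\begin{vmatrix} p_{12} & p_{13}\\ p_{32} & p_{33} - 1\end{vmatrix}}{|A|}$$

so that $-f_{12}/x_2$ is the inverse element of $p_{21}$ in the matrix

$$\|A\| = \begin{Vmatrix} p_{11} - 1 & p_{12} & p_{13}\\ p_{21} & p_{22} - 1 & p_{23}\\ p_{31} & p_{32} & p_{33} - 1\end{Vmatrix}$$

(see Sec. 1.3). Let the reader show that $-f_{22}/x_2$ is the inverse element
of $p_{22}$ for $\|A\|$, etc. Thus probability theory can be applied to approximate
the elements of the inverse matrix of a given matrix. The quantities
$f_{ij}$, $i, j = 1, 2, 3$, are obtained by experiment.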
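
The experiment can be imitated on a computer. The sketch below plays the spinner game many times with pseudo-random numbers and compares the observed frequency of ending at spinner II, having started at spinner I, with the value of $f_{12}$ given by the determinant formula above. The particular probabilities $p_{ij}$, the seed, and the number of trials are arbitrary assumptions chosen only for illustration.

```python
import random

# Assumed transition areas p_ij (each row sums to less than 1).
P = [[0.2, 0.3, 0.1],
     [0.1, 0.4, 0.2],
     [0.3, 0.1, 0.3]]
X = [1.0 - sum(row) for row in P]    # stopping areas x_i = 1 - sum_j p_ij

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def exact_f12():
    """f_12 = x_2 * |p_12 p_13; p_32 p_33 - 1| / |A| with A = P - I."""
    A = [[P[i][j] - (1.0 if i == j else 0.0) for j in range(3)] for i in range(3)]
    minor = P[0][1] * (P[2][2] - 1.0) - P[0][2] * P[2][1]
    return X[1] * minor / det3(A)

def play(start, rng):
    """Play one game from spinner `start`; return the index of the spinner where it ends."""
    i = start
    while True:
        r, cumulative = rng.random(), 0.0
        for j in range(3):
            cumulative += P[i][j]
            if r < cumulative:       # pointer rests in area p_ij: move to spinner j
                i = j
                break
        else:
            return i                 # pointer rests in area x_i: the game is over

rng = random.Random(1)
trials = 50_000
estimate = sum(play(0, rng) == 1 for _ in range(trials)) / trials
print(estimate, exact_f12())
```

The two printed numbers should agree to about two decimal places; dividing the estimate by $-x_2$ then approximates the corresponding element of the inverse of $\|A\|$.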

An analogy exists between the diffusion process of heat motion and the
two-dimensional random-walk problem. Suppose that a particle located
at $(x, y)$ moves to one of the four positions $(x + 1, y)$, $(x - 1, y)$,
$(x, y + 1)$, $(x, y - 1)$, with equal probability, $p = \tfrac14$. What is the
probability $P(x, y; t)$ that a particle will arrive at $(x, y)$ after $t$ steps if it
starts from the origin (0, 0)? For a particle to arrive at $(x, y)$ in $t$ steps, it is
obvious that it will have had to arrive at one of the four positions
$(x, y - 1)$, $(x, y + 1)$, $(x - 1, y)$, $(x + 1, y)$ in $t - 1$ steps. Thus

$$P(x, y; t) = \tfrac14[P(x, y - 1; t - 1) + P(x, y + 1; t - 1)
+ P(x - 1, y; t - 1) + P(x + 1, y; t - 1)] \tag{9.64}$$

Let us subtract $P(x, y; t - 1)$ from both sides of (9.64). Then

$$\begin{aligned}
P(x, y; t) - P(x, y; t - 1) &= \tfrac14[P(x + 1, y; t - 1) - 2P(x, y; t - 1) + P(x - 1, y; t - 1)]\\
&\quad + \tfrac14[P(x, y + 1; t - 1) - 2P(x, y; t - 1) + P(x, y - 1; t - 1)]
\end{aligned} \tag{9.65}$$

In Sec. 10.13 one notes that, if $f''(x)$ is continuous for $a \le x \le b$, then

$$f''(x) = \lim_{h \to 0}\frac{f(x + 2h) - 2f(x + h) + f(x)}{h^2} \tag{9.66}$$

[see (10.22)]. In the calculus of finite differences the expression $[f(x + 2)
- 2f(x + 1) + f(x)]$ represents the second difference of $f(x)$. Thus

$$\begin{aligned}
\Delta f(x) &= f(x + 1) - f(x)\\
\Delta f(x + 1) &= f(x + 2) - f(x + 1)\\
\Delta^2 f(x) &= \Delta f(x + 1) - \Delta f(x) = f(x + 2) - 2f(x + 1) + f(x)
\end{aligned} \tag{9.67}$$

Thus (9.65) may be written

$$\Delta_t P = \tfrac14(\Delta_x^2 P + \Delta_y^2 P) \tag{9.68}$$

which may be compared with the diffusion equation

$$\frac{\partial P}{\partial t} = k\Bigl(\frac{\partial^2 P}{\partial x^2} + \frac{\partial^2 P}{\partial y^2}\Bigr) \tag{9.69}$$

A very fine subdivision of space and time must be used, however, in order
to freely interchange (9.68) and (9.69). One traces the histories of a
large number of particles with initial lattice distributions to obtain the
distributions after a given number of steps. These histories yield an
approximate solution to (9.69).
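
Instead of tracing individual histories, the recursion (9.64) can be iterated exactly on the lattice. The sketch below does this with rational arithmetic, starting from a particle at the origin; the function names and the dictionary representation of the lattice are illustrative choices only.

```python
from fractions import Fraction

def step(dist):
    """Apply (9.64) once: each lattice point passes 1/4 of its probability to each neighbor."""
    new = {}
    for (x, y), prob in dist.items():
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            key = (x + dx, y + dy)
            new[key] = new.get(key, Fraction(0)) + prob / 4
    return new

def walk_distribution(t):
    """Return P(x, y; t) for a walk that starts at the origin, as a dictionary."""
    dist = {(0, 0): Fraction(1)}
    for _ in range(t):
        dist = step(dist)
    return dist

P2 = walk_distribution(2)
print(P2[(0, 0)], P2[(1, 1)], P2[(2, 0)], sum(P2.values()))   # 1/4 1/8 1/16 1
```

After two steps the particle is back at the origin with probability $\tfrac14$, since four of the sixteen equally likely paths return there.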

We conclude this section with a brief discussion of the theory of games. 
A simple example illustrates the basic factors encountered in the theory 
of games. We consider a two-person zero-sum game as exemplified by 
the square array given by (9.70). 





] —6 2 (9.70) 


X and Y choose a row and column, respectively, and the choice of each is
made without any express knowledge of the other's choice. The number
in the $i$th row and $j$th column (the choices of X and Y, respectively)
designates the number of units that Y must pay X. If $a_{ij} < 0$, then
X pays Y an amount $|a_{ij}|$. The total sum earned by X and Y is zero,
and it is in this sense that we have a two-person zero-sum game. We
assume that X will attempt to earn as much as possible and that Y will
attempt to hold his losses to a minimum. We see that in this particular
game the possible courses of action of X and Y, the consequences of these
actions, and the objectives of X and Y are fully known. The theory of
games seeks to analyze objectively the conflict of X and Y and, moreover,
attempts to determine an optimal course of action for each player.




The game given by (9.70) is very easily analyzed. If X chooses row 1, 
the least he can gain is —6. If he chooses row 2, the least he can gain is 
4 units. Since 4 > —6, it is obvious that X will always gain at least 
4 units by choosing row 2, no matter what the choice of Y. Since Y 
knows that X will always choose row 2, of necessity, Y will always choose 
column 2 to hold his losses to a minimum. Since X and Y will always 
choose row 2 and column 2, respectively, with probability 1, we say that 
X and Y use pure strategies. 

Let us note the following: Let $f(x, y)$ denote the quantity in row $x$ and
column $y$. The smallest number in row $x$ is designated by $\min_y f(x, y)$.
The largest of these numbers is written

$$U = \max_x \min_y f(x, y) \tag{9.71}$$

Thus $U = \min_y f(x_0, y) \ge \min_y f(x, y)$ for all $x$. Let the reader deduce
that the choice of $x = x_0$ by X guarantees that X will always earn at
least $U$ units. Let us determine the maximum amount Y can possibly
lose if he plays wisely whatever the action of X. The largest number in
column $y$ is designated by $\max_x f(x, y)$. The smallest of all such numbers
obtained by varying $y$ is

$$V = \min_y \max_x f(x, y) \tag{9.72}$$

Thus $V = \max_x f(x, y_0) \le \max_x f(x, y)$ for all $y$. Let the reader deduce
that the choice of $y = y_0$ will enable Y to hold his loss to an amount at
most $V$. For the square array given by (9.70) we have

$$U = V = f(2, 2) = 4$$

and it is because of this that X always chooses the second row and
Y always chooses the second column. Let us note that the element
$f(2, 2) = 4$ is the minimum element of the second row and the maximum
element of the second column. It is for this reason that $f(2, 2)$ is called
a saddle element. Let the reader deduce that the existence of a saddle
element implies that $U = V$. In any case one always has $U \le V$. The
proof proceeds as follows: Let $g(x) = \min_y f(x, y)$, $U = g(x_0) = \max_x g(x)$,
$h(y) = \max_x f(x, y)$, $V = h(y_0) = \min_y h(y)$. Then

$$U = g(x_0) = \max_x g(x) = \max_x\,[\min_y f(x, y)] = \min_y f(x_0, y)
\le f(x_0, y_0) \le \max_x f(x, y_0) = h(y_0) = V \tag{9.73}$$


The reader is urged to verify every step of (9.73). 




We now consider a case for which $U < V$, given by (9.74).

$$\begin{array}{c|cc}
X \backslash Y & 1 & 2\\ \hline
1 & 3 & 2\\
2 & -1 & 4
\end{array} \tag{9.74}$$

Let the reader note the nonexistence of a saddle element with
$$U = 2 < V = 3$$
Thus X is sure to win 2 units per game if he always chooses $x = 1$, while
Y can never lose more than 3 units if he always chooses $y = 1$. If Y
were to notice, however, that X always chose $x = 1$, then it would be
expedient for Y to choose $y = 2$. Can it pay for X to change his strategy
so that he can realize more than 2 units per game on the average? The
answer is "yes"! Let us suppose that X and Y use mixed strategies in
the following sense: X chooses $x = 1$ with probability $p$ and chooses $x = 2$
with probability $1 - p$; Y chooses $y = 1$ with probability $q$ and chooses
$y = 2$ with probability $1 - q$. The total expectation of X is

$$\begin{aligned}
E &= 3pq + 2p(1 - q) + (-1)(1 - p)q + 4(1 - p)(1 - q)\\
&= q(6p - 5) + 4 - 2p
\end{aligned} \tag{9.75}$$

X now reasons as follows: What mixed strategy should I use in order
to guarantee a given expectation no matter what the mixed strategy
employed by Y? If $6p - 5 > 0$ or $p > \tfrac56$, then $E$ is a minimum for
$q = 0$, so that $E = 4 - 2p < \tfrac73$. On the other hand, if $p < \tfrac56$, $E$ is a
minimum for $q = 1$ so that $E = 4p - 1 < \tfrac73$. For $6p - 5 = 0$ or $p = \tfrac56$
we have $E = \tfrac73$ independent of $q$. Thus, no matter what the strategy
of Y, X can be assured of earning $\tfrac73$ of a unit per game by mixing his
strategy. X chooses row 2 if a six occurs on a die and otherwise chooses
row 1. Let the reader show that Y can choose a mixed strategy such
that he will never lose more than $\tfrac73$ of a unit per game no matter what
the strategy of X.
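
The reasoning of X can be mimicked by a direct search. Since $E$ in (9.75) is linear in $q$, the worst case for X occurs at $q = 0$ or $q = 1$, and it suffices to maximize the smaller of these two values over $p$. The sketch below searches a grid of rational values of $p$; the grid spacing of $1/60$ is an arbitrary assumption chosen so that $p = \tfrac56$ lies on the grid.

```python
from fractions import Fraction

def expectation(p, q):
    """E of (9.75) for the array (9.74)."""
    return 3*p*q + 2*p*(1 - q) - (1 - p)*q + 4*(1 - p)*(1 - q)

def guaranteed(p):
    """Least expectation of X over Y's strategies; E is linear in q, so test q = 0 and q = 1."""
    return min(expectation(p, Fraction(0)), expectation(p, Fraction(1)))

grid = [Fraction(k, 60) for k in range(61)]
best_p = max(grid, key=guaranteed)
print(best_p, guaranteed(best_p))   # 5/6 7/3
```

The guaranteed value $\tfrac73$ lies between $U = 2$ and $V = 3$, as it must.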
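Let us now consider the following game exemplified by (9.76), with
$a > 0$, $b > 0$:

$$\begin{array}{c|ccc}
X \backslash Y & 1 & 2 & 3\\ \hline
1 & 1 & -1 & 0\\
2 & -1 & 1 & 0\\
3 & a & a & -b
\end{array} \tag{9.76}$$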







Let $p_1$, $p_2$, $1 - (p_1 + p_2)$ be the probabilities associated with the choices
of $x = 1, 2, 3$, respectively, and let $q_1$, $q_2$, $1 - (q_1 + q_2)$ be the
probabilities associated with the choices of $y = 1, 2, 3$, respectively. The
expectation of X is

$$\begin{aligned}
E &= p_1(q_1 - q_2) + p_2(-q_1 + q_2)
+ [1 - (p_1 + p_2)][a(q_1 + q_2) - b(1 - (q_1 + q_2))]\\
&= q_1[p_1 - p_2 + (a + b)(1 - p_1 - p_2)]
+ q_2[-p_1 + p_2 + (a + b)(1 - p_1 - p_2)] - b(1 - p_1 - p_2)
\end{aligned} \tag{9.77}$$

Let us examine $E$ of (9.77). For $p_1 = p_2 = \tfrac12$ we have $E = 0$ independent
of $q_1$ and $q_2$. Thus a mixed strategy occurs for X such that
X cannot lose or gain. Y has a good strategy (pure) with $q_1 = q_2 = 0$,
$q_3 = 1 - q_1 - q_2 = 1$. Why is it obvious that the only good strategy
for X is the mixed strategy discussed above? However, we can show
that good strategies exist for Y other than $q_1 = q_2 = 0$, $q_3 = 1$. We can
write

$$\begin{aligned}
E &= p_1[(1 - a - b)q_1 - (1 + a + b)q_2 + b]\\
&\quad + p_2[-(1 + a + b)q_1 + (1 - a - b)q_2 + b] + (a + b)(q_1 + q_2) - b
\end{aligned}$$

It is obvious that $E$ will be independent of $p_1$ and $p_2$ if the coefficients
of $p_1$ and $p_2$ are zero. The solutions of these two equations yield
$q_1 = q_2 = b/2(a + b)$, which in turn yields $E = 0$ so that Y can suffer
no loss for this choice of $q_1$ and $q_2$. Thus a good strategy which is mixed
exists for Y. Let the reader show that, if $q_1 = q_2 < b/2(a + b)$, then
$E = (1 - p_1 - p_2)[2(a + b)q_1 - b] \le 0$ for all choices of $p_1$ and $p_2$, so
that an infinite number of good mixed strategies exist for Y.
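
The claims about (9.77) can be spot-checked with exact rational arithmetic. In the sketch below the values $a = 2$, $b = 3$ and the trial strategies of X are arbitrary assumptions; the check confirms only these instances and is no substitute for the algebra.

```python
from fractions import Fraction

def expectation(p1, p2, q1, q2, a, b):
    """E of (9.77), written out from its first line."""
    p3, q3 = 1 - p1 - p2, 1 - q1 - q2
    return p1*(q1 - q2) + p2*(-q1 + q2) + p3*(a*(q1 + q2) - b*q3)

a, b = Fraction(2), Fraction(3)          # arbitrary positive values
q_star = b / (2 * (a + b))               # the mixed strategy q_1 = q_2 = b / 2(a + b)
trial_p = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)),
           (Fraction(0), Fraction(0)), (Fraction(1, 3), Fraction(1, 3))]

at_q_star = [expectation(p1, p2, q_star, q_star, a, b) for p1, p2 in trial_p]
below_q_star = [expectation(p1, p2, q_star / 2, q_star / 2, a, b) for p1, p2 in trial_p]
print(at_q_star)      # every entry is 0
print(below_q_star)   # no entry is positive
```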


Problems 


1. Generalize the results of (9.63) for the case of $n$ spinners.

2. Let $f(x)$ be a continuous monotonic increasing function defined for $0 \le x \le 1$.
If a random variable $\xi$ be chosen from (0, 1), show that the probability that $f(\xi) \le y_0$,
$f(0) \le y_0 \le f(1)$, is given by $p = f^{-1}(y_0)$, so that $f(p) = y_0$. Use this method to find
an approximate value of $\sqrt{2}$ by use of a table of random numbers.

3. Let us consider the plane of a two-dimensional random-walk problem with
boundary points at $(x_i, y_i)$, $i = 1, 2, \ldots, r$. To each point $(x_i, y_i)$ we associate a
value $U(x_i, y_i)$. Let $V(x, y)$ be the expected value of a particle whose motion starts
at $(x, y)$ and eventually ends at one of the boundary points. Show that

$$V(x, y) = \sum_{i=1}^{r} U(x_i, y_i)P_i$$

with $P_i$ the probability that the walk terminates at $(x_i, y_i)$ if it begins at $(x, y)$. Also
show that
$$V(x, y) = \tfrac14[V(x + 1, y) + V(x - 1, y) + V(x, y + 1) + V(x, y - 1)]$$
and compare this result with $\dfrac{\partial^2 V}{\partial x^2} + \dfrac{\partial^2 V}{\partial y^2} = 0$.




4. Show that the game of "paper, rock, and scissor" is exemplified by the matrix
below. What strategies should be used by X and Y?

$$\begin{array}{c|ccc}
X \backslash Y & P\ (1) & R\ (2) & S\ (3)\\ \hline
P\ (1) & 0 & 1 & -1\\
R\ (2) & -1 & 0 & 1\\
S\ (3) & 1 & -1 & 0
\end{array}$$
REFERENCES 


Cramér, H.: "Mathematical Methods of Statistics," Princeton University Press,
Princeton, N.J., 1946.

Doob, J. L.: "Stochastic Processes," John Wiley & Sons, Inc., New York, 1953.

Feller, W.: "Introduction to Probability Theory," John Wiley & Sons, Inc., New
York, 1950.

Mood, A. M.: "Introduction to the Theory of Statistics," McGraw-Hill Book
Company, Inc., New York, 1950.

Morgenstern, O., and J. von Neumann: "Theory of Games and Economic Behavior,"
Princeton University Press, Princeton, N.J., 1947.

Munroe, M. E.: "Theory of Probability," McGraw-Hill Book Company, Inc., New
York, 1951.

Uspensky, J. V.: "Introduction to Mathematical Probability," McGraw-Hill Book
Company, Inc., New York, 1937.


CHAPTER 10 


REAL-VARIABLE THEORY 


10.1. Introduction. We desire to preface the study of the real-number 
system by first introducing a brief view of our philosophy of mathematics 
and science. Historians are generally in agreement that in many ways 
Greek mathematics was the precursor to modern mathematics. One 
begins the study of geometry with a set of postulates or axioms along 
with some definitions of the fundamental entities in which one is inter- 
ested. From this beginning one deduces through the use of Aristotelian 
logic the many theorems concerning triangles, circles, etc. To the novice 
the axioms appear to be self-evident truths. We do not hold to this view:
a postulate is nothing more than a man-invented rule of the game, the 
game in this case being mathematics. Actually the mathematician is 
not interested in the truth, per se, of his postulates. The postulates of 
a mathematical system can be chosen arbitrarily subject to the condition 
that they be consistent with each other. The proof of the self-consistency 
of the postulates is no easy task, if, indeed, such proofs exist. A dis- 
cussion of such matters lies beyond the scope of this text. 

Finally, we should like to add that any theorems deducible from the
postulates are a consequence of the postulates themselves. We admit 
this is a trivial and obvious remark. Too often, however, the scientist
forgets this fact. He is prone to believe that the laws of nature are dis- 
covered. It is our personal belief that this is not the case. We lean 
toward the more esoteric point of view that the so-called laws of nature 
are invented by man. Newton described the motion of the planet 
Mercury relative to the sun in a fairly accurate manner by use of $\mathbf{f} = m\mathbf{a}$
and $\mathbf{f} = -(GmM/r^3)\mathbf{r}$. Because of this one feels that these two
equations are laws of nature discovered by Newton. On the other hand, by
postulating that a gravitational point mass yields a four-dimensional 
Riemannian space and that a particle moves along a geodesic of this 
space, Einstein also obtained the motion of Mercury relative to the sun. 
It is well known that Einstein's predicted motion of Mercury is more
accurate than Newton’s predicted motion. Are we to assume that from 
the time of Newton to Einstein we had a ‘‘true”’ law of nature and that 
now we discard this law and accept Einstein’s theory as the truth? It is 



just a matter of time before a physicist will invent a new set of postulates 
which will explain more fully the physical results obtained by experiment. 
Moreover it is not our belief that we continually approach the true laws 
of nature. 

10.2. The Positive Integers. We begin a study of the real-number sys- 
tem by first discussing the positive integers from a postulational point of 
view. The positive integers will be undefined in the sense that they can 
be anything the reader desires subject to the condition that they satisfy 
the postulates set forth below. We shall attempt to be rigorous, but 
shall not be as rigorous as might be possible. First, we do not feel
qualified to attempt this task, and second, it would require a treatise in itself
to elaborate on the logical and philosophical aspects and implications of
the material upon which we shall discourse.

The positive integers have been characterized by Peano through the 
following postulates: 

$P_a$: There exists a positive integer, called one, and written 1.

$P_b$: Every positive integer $x$ has a unique successor $x'$.

$P_c$: There is no positive integer $x$ such that $x' = 1$.

$P_d$: If $x' = y'$, then $x = y$.

$P_e$: Let $S$ be any collection of positive integers. If $S$ contains the
integer 1, and if, moreover, any integer $x$ of $S$ implies that $x'$ is in $S$, then
$S$ is the complete collection of integers, $I$.

The postulate $P_a$ guarantees that there is at least one member of our
set of positive integers so that we may all agree that there is something
we can talk about. The equal sign (=) involved in the relation $x = y$
of $P_d$ means that $x$ and $y$ are identical elements in the sense that we may
replace $x$ by $y$ or $y$ by $x$ in any operation concerning these integers. The
postulate $P_e$ differs in some respects from the other postulates. $P_e$ is a
rule for determining when we have the complete set of integers at hand,
and is used chiefly in deducing new laws which the integers obey.

As postulated above, the integers could be any sequence $(a_1, a_2, \ldots,
a_n, \ldots)$ which the reader has encountered in calculus courses. It is in
this sense that the integers are undefined. It is obvious that more will
have to be said before we can identify the collection of integers defined
above with the ordinary integers of our everyday working world. We
shall accept as intuitive the notion and idea of a set, or collection, of
objects. The Peano postulates will make themselves clearer to the
student as he proceeds to read the text.

The operation of addition (+) relative to the positive integers is defined
by the postulates

$A_a$: $a' = a + 1$

$A_b$: $a + b' = (a + b)'$

The postulate $A_b$ implies that $a + b$ is an integer (closure property).




We are now in a position to deduce some new results concerning the 
positive integers. 

THEOREM 10.1. Addition is associative, $(a + b) + c = a + (b + c)$.
We let $S$ be the collection of integers $x, y, z, \ldots$ which satisfy Theorem
10.1 for all $a$ and $b$. Thus $x$ is in $S$ if $(a + b) + x = a + (b + x)$ for all
integers $a$ and $b$. We first show that 1 is an element of $S$. Now

$$\begin{aligned}
(a + b) + 1 &= (a + b)' && \text{from } A_a\\
&= a + b' && \text{from } A_b\\
&= a + (b + 1) && \text{from } A_a
\end{aligned}$$

so that 1 belongs to $S$. Now let $x$ be any element of $S$. This means
that $(a + b) + x = a + (b + x)$ for all $a$ and $b$. Now

$$\begin{aligned}
(a + b) + x' &= [(a + b) + x]' && \text{from } A_b\\
&= [a + (b + x)]' && \text{since } x \text{ is in } S\\
&= a + (b + x)' && \text{from } A_b\\
&= a + (b + x') && \text{from } A_b
\end{aligned}$$

so that $x'$ belongs to $S$ whenever $x$ is in $S$. From $P_e$, $S$ is the complete
set of integers, $S = I$. Q.E.D.

THEOREM 10.2. Addition is commutative, $a + b = b + a$ (see Prob. 1).

THEOREM 10.3. Law of Cancellation. If $a + b = a + c$, then $b = c$.
We proceed by induction on $a$. First, if $1 + b = 1 + c$, then

$$\begin{aligned}
b + 1 &= c + 1 && \text{from Theorem 10.2}\\
b' &= c' && \text{from } A_a\\
b &= c && \text{from } P_d
\end{aligned}$$

Hence 1 belongs to $S$. Now let $x$ be any element of $S$. This means that
whenever $x + b = x + c$ then $b = c$. Now assume $x' + b = x' + c$.
Then

$$\begin{aligned}
b + x' &= c + x' && \text{from Theorem 10.2}\\
(b + x)' &= (c + x)' && \text{from } A_b\\
b + x &= c + x && \text{from } P_d\\
x + b &= x + c && \text{from Theorem 10.2}\\
b &= c && \text{since } x \text{ is in } S
\end{aligned}$$

Thus $x'$ belongs to $S$ whenever $x$ belongs to $S$. From $P_e$, $S = I$. Q.E.D.

THEOREM 10.4. If $b + a = c + a$, then $b = c$ (see Prob. 2).

In what follows we shall use the symbol ($\Rightarrow$) to mean "implies."
Thus Theorem 10.4 may be written $b + a = c + a \Rightarrow b = c$. Implication
in both directions will be represented by ($\Leftrightarrow$). Thus

$$b + a = c + a \Leftrightarrow b = c$$




The reader should realize that $b = c \Rightarrow b + a = c + a$ does not mean
that we have introduced the axiom about adding equals to equals. All
we have done to the expression $b + a$ is to replace $b$ by $c$ (permissible
from the definition of equality of two integers).

Equality (=) belongs to the class of relations, $R$, called equivalence
relations. We note that

$$a = a \qquad a = b \Rightarrow b = a \qquad a = b,\ b = c \Rightarrow a = c$$

In general a relation $R$ is called an equivalence relation relative to
a set of elements $(a, b, c, \ldots)$ if

$R_1$: $aRa$ (reflexive)

$R_2$: $aRb \Rightarrow bRa$ (symmetric)

$R_3$: $aRb,\ bRc \Rightarrow aRc$ (transitive)

An equivalence relation $R$ enables one to distinguish between various
classes of elements. For example, we say that $aRb$ holds if and only
if $a$ and $b$ yield the same remainder when divided by 2. Thus any integer
belongs either to the class of even integers or to the class of odd integers.
The reader should check that the elements of the class of even integers
satisfy $R_1$, $R_2$, $R_3$. We write $aRb$ as $a \equiv b \bmod 2$.

Now we introduce multiplication by the two postulates given below: 


$M_a$: $a \cdot 1 = a$

$M_b$: $a \cdot b' = a \cdot b + a$

$M_b$ assumes that $a \cdot b$ is an integer for all $a$ and $b$. We shall omit the dot
whenever it is convenient to do so. Thus $M_b$ can be written $ab' = ab + a$.

THEOREM 10.5. Multiplication is left-distributive with respect to addition,
$a(b + c) = ab + ac$. We prove this theorem by induction on $c$.
Let $S$ be the collection of integers $(x, y, z, \ldots)$ such that

$$a(b + x) = ab + ax$$

for $x$ in $S$ and for all $a$ and $b$. First

$$a(b + 1) = ab' = ab + a = ab + a \cdot 1$$

from $A_a$, $M_b$, and $M_a$. Thus 1 is in $S$. Now let $x$ be in $S$. We have

$$\begin{aligned}
a(b + x') &= a(b + x)' && \text{from } A_b\\
&= a(b + x) + a && \text{from } M_b\\
&= (ab + ax) + a && \text{since } x \text{ is in } S\\
&= ab + (ax + a) && \text{from Theorem 10.1}\\
&= ab + ax' && \text{from } M_b
\end{aligned}$$

so that $x'$ is in $S$ if $x$ is in $S$. Thus $S = I$.
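
The recursive character of the postulates for addition and multiplication can be made concrete in a short program. The sketch below is only a model: it represents the positive integers by Python's built-in integers, takes `x + 1` as a stand-in for the successor $x'$, and uses `b - 1` to recover the integer whose successor is `b`. The finite checks at the end illustrate Theorems 10.1 and 10.5 for small values; they are not proofs.

```python
def succ(x):
    """Stand-in for the successor x' of the positive integer x."""
    return x + 1

def add(a, b):
    """Addition built from the two postulates: a + 1 = a'  and  a + b' = (a + b)'."""
    return succ(a) if b == 1 else succ(add(a, b - 1))

def mul(a, b):
    """Multiplication built from the two postulates: a * 1 = a  and  a * b' = a * b + a."""
    return a if b == 1 else add(mul(a, b - 1), a)

N = range(1, 8)   # a small initial segment of the positive integers

# Theorem 10.1: (a + b) + c = a + (b + c)
associative = all(add(add(a, b), c) == add(a, add(b, c))
                  for a in N for b in N for c in N)

# Theorem 10.5: a(b + c) = ab + ac
distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                   for a in N for b in N for c in N)

print(add(3, 4), mul(3, 4), associative, distributive)   # 7 12 True True
```

Only `succ` touches the underlying representation; `add` and `mul` use nothing beyond the recursion supplied by the postulates.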




THEOREM 10.6. 1-a = a (see Prob. 3). 

THEOREM 10.7. Multiplication is associative, (ab)c = a(bc) (see Prob. 
4), 

THEOREM 10.8. Multiplication 1s commutative, ab = ba (see Prob. 5). 

THEOREM 10.9. Multiplication 1s right-commutative, 


(6+ cla = ba + ca 
(see Prob. 6). 
DEFINITION 10.1. If a = b + c for some integer c, we say that a is greater
than b or that b is less than a, written a > b or b < a, respectively. We can state
four important ordering theorems regarding inequalities:

O₁: a, b given ⇒ a = b or a > b or a < b
O₂: a > b, b > c ⇒ a > c
O₃: a > b ⇒ a + c > b + c
O₄: a > b ⇒ ac > bc

We leave these theorems as exercises for the reader (see Probs. 7 and 12).
A set obeying O₁ is said to be totally ordered.

To place the integers in the realm of our everyday working experiences,
we now postulate that the integer 1, which occurred in the first of Peano’s
postulates, P₁, will be the adjective which reasonable, sane, prudent
human beings attach to single entities when they speak intelligently of
one book, one day, one world, etc. Of course the number 1 as given in
Peano’s postulates is a noun, not an adjective. B. Russell associates
“one” with the class of all single entities. Trouble occurs, however,
when one tries to define a class of elements.

The successor of one, namely, 1’ = 1 + 1, will be called the integer
two, written 2, and addition will again mean what we wish it to mean,
namely, that one book plus one book implies two books. We now feel
free to speak of the class of positive integers J(1, 2, 3, . . . , n, . . .).


Problems 


1. Prove Theorem 10.2. Hint: First prove that a + 1 = 1 + a for all a.
2. Prove Theorem 10.4.
3. Prove Theorem 10.6.
4. Prove Theorem 10.7.
5. Prove Theorem 10.8.
6. Prove Theorem 10.9.
7. Prove ordering theorems O₂, O₃, O₄.
8. Prove that 1·a = a.
9. Prove that b’a = ba + a.
10. Show that 1 ≤ a for all integers a. Hint: Let S be the class of all integers x such that
x ≥ 1. Then show that S = J.
11. If a’ > b, show that a > b or a = b.
12. Using the results of Probs. 10, 11, prove O₁.
13. Prove that every nonempty set of integers contains a smallest integer. First
define what is meant by a smallest integer of a collection of integers. Also use the
results of Probs. 10, 11.


10.3. The Rational Integers. An extension of the positive integers 
which includes the zero element and the negative integers is obtained in 
the following manner: Given any pair of positive integers, we postulate 
the existence of a difference function d which operates on the positive 
integers a, b subject to the condition that 


d(a, b) = d(x, y) (10.1)


if and only if a + y = b + x. Equality as defined by (10.1) certainly is
an equivalence relation. We note that the above definition depends only
on addition of positive integers (Sec. 10.2).

Let us see what properties are possessed by d. Obviously 


d(a, a) = d(b, b)

Moreover d(a + c, b + c) = d(a, b). If we desire that d(a, b) behave
like b − a of ordinary arithmetic, of necessity we are forced to postulate
that addition be defined by

d(a, b) + d(x, y) = d(a + x, b + y) (10.2)

We now note that

d(a, b) + d(c, c) = d(a + c, b + c) = d(a, b) (10.3)

from (10.2) and (10.1). Hence d(c, c) has the property of being a zero
element for the operation of addition (+).
We define multiplication by the following rule: 


d(a, b)d(x, y) = d(ay + bx, by + ax) = d(x, y)d(a, b) (10.4)

What of the number d(a, a + 1)? We see that

d(x, y)d(a, a + 1) = d(xa + x + ya, ax + ya + y)
= d(x, y) = d(a, a + 1)d(x, y) (10.5)

so that d(a, a + 1) has the unit property as regards multiplication.
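The rules (10.1), (10.2), and (10.4) can be tried out on pairs of positive integers. The sketch below represents d(a, b) by the tuple (a, b); the names `Pair`, `equal`, `add`, `mul`, and the sample values are illustrative assumptions rather than notation from the text.

```python
Pair = tuple[int, int]  # (a, b) stands for d(a, b), which behaves like b - a

def equal(p: Pair, q: Pair) -> bool:
    """(10.1): d(a, b) = d(x, y) if and only if a + y = b + x."""
    (a, b), (x, y) = p, q
    return a + y == b + x

def add(p: Pair, q: Pair) -> Pair:
    """(10.2): d(a, b) + d(x, y) = d(a + x, b + y)."""
    (a, b), (x, y) = p, q
    return (a + x, b + y)

def mul(p: Pair, q: Pair) -> Pair:
    """(10.4): d(a, b) d(x, y) = d(ay + bx, by + ax)."""
    (a, b), (x, y) = p, q
    return (a * y + b * x, b * y + a * x)

zero = (5, 5)        # any d(c, c) acts as the zero element
one = (3, 4)         # any d(a, a + 1) acts as the unit
minus_five = (7, 2)  # behaves like 2 - 7 = -5
```

Note that equality is tested with `equal`, not with `==` on the tuples, since many different pairs represent the same rational integer.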

It now can easily be shown that the collection of elements d(1, x + 1) 
when x ranges over the set of positive integers will satisfy all the Peano 
postulates outlined in Sec. 10.2. The set of numbers d(a, b) contains the
positive integers as a subclass and in its totality represents the class 
of rational integers (positive integers, negative integers, and the zero 
element). We let the reader show that the zero element d(a, a) has the 
properties 

d(a, a)d(x, y) = d(a, a) (10.6) 
d(x, y) + d(y, x) = d(a, a) 




We say that d(y, x) is the negative of d(x, y), and conversely. We may,
if we please, write d(a, a) = 0, d(1, a + 1) = a, d(a + 1, 1) = −a,
d(a, b) = b − a. The ordering postulates can easily be extended to
cover the new collection d(a, b). We say that

d(a, b) > d(x, y)

if and only if

b + x > a + y

Let the reader show that

d(a, b) > d(x, y) ⇒ d(a, b) + d(u, v) > d(x, y) + d(u, v)
d(a, b) > d(x, y) ⇒ d(a, b)d(z, u + z) > d(x, y)d(z, u + z) (10.7)
d(a, b) > d(x, y) ⇒ d(x, y)d(u + z, z) > d(a, b)d(u + z, z)


What is the ordinary meaning of (10.7)? 

Let us note some properties of the rational integers as outlined above. 
First we have a set of elements called the rational integers. Associated 
with this set of elements are two operations called addition and multipli- 
cation. The elements satisfy the following properties relative to the 
operations defined above: 

1. Closure properties: a + b, ab are elements of our set when a, b are
elements of the set.

2. Associative laws: a + (b + c) = (a + b) + c, a(bc) = (ab)c.

3. Unit elements: a + 0 = a = 0 + a, a·1 = a = 1·a.

4. Distributive laws: a(b + c) = ab + ac, (a + b)c = ac + bc.

5. An inverse element exists for each element relative to addition,
that is, a + (−a) = (−a) + a = 0 for all a.

We say that the rational integers form a ring with unit element 1. We
also note that, if ab = 0, then a = 0 or b = 0. Why?

We give an example of a ring such that ab = 0 does not imply a = 0
or b = 0. Such rings are said to have zero divisors. Let us consider the
six following classes: Into class one we place all integers which yield a
remainder 1 upon division by 6. Class two contains all integers which
yield a remainder 2 upon division by 6. Class r has the obvious property
that it consists of those integers which yield a remainder r upon division
by 6. We represent these various classes by the symbols 1, 2, 3, 4, 5, 0.
Addition and multiplication are defined in the ordinary way. Thus
2·5 = 10 ≡ 4 mod 6, since 10 yields a remainder of 4 when divided by 6.
3 + 4 = 7 ≡ 1 mod 6, etc. It is obvious that 2·3 ≡ 0 mod 6, but
neither 2 ≡ 0 mod 6 nor 3 ≡ 0 mod 6. We have constructed the ring of
integers modulo 6.
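The zero divisors of the ring of integers modulo 6 can be listed by direct search. The following is a small sketch; the names `MODULUS`, `add_mod`, `mul_mod`, and `zero_divisor_pairs` are illustrative.

```python
MODULUS = 6
classes = range(MODULUS)  # the symbols 0, 1, 2, 3, 4, 5

def add_mod(a: int, b: int) -> int:
    """Add two classes and reduce to the remainder on division by 6."""
    return (a + b) % MODULUS

def mul_mod(a: int, b: int) -> int:
    """Multiply two classes and reduce to the remainder on division by 6."""
    return (a * b) % MODULUS

# Pairs of nonzero classes whose product is the zero class.
zero_divisor_pairs = [
    (a, b)
    for a in classes
    for b in classes
    if a != 0 and b != 0 and mul_mod(a, b) == 0
]
```

Replacing 6 by a prime modulus makes the list empty, which is one way to see why the integers modulo a prime behave differently.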

10.4. The Rational Numbers. We can extend the ring of rational
integers by postulating the existence of a rational function R(a, b) where
a, b are any two rational integers, a ≠ 0. We postulate that R is to
have the following properties:

R(a, b) = R(x, y) ⇔ bx = ay
R(a, b) + R(x, y) = R(ax, bx + ay)
R(a, b)R(x, y) = R(ax, by) (10.8)
R(a, b) > R(x, y) ⇔ bx > ay if ax > 0
R(a, b) > R(x, y) ⇔ bx < ay if ax < 0


We leave it to the reader to show that the set of elements R(1, b)
has all the properties of the set of rational integers b. For all practical
considerations we may write R(1, b) = b, and we shall obviously write
R(a, b) = b/a. The set of rationals has a further property not possessed
by the ring of rational integers. For every R(a, b), b ≠ 0, a ≠ 0, there
exists a rational R(b, a) such that

R(a, b)R(b, a) = R(ab, ba) = R(1, 1) = 1


Thus every nonzero element has an inverse with respect to multiplication. 
We say that the rationals form a field. 

The reader should have no trouble showing that the integers modulo 7
form a field. The inverse of 2 is 4 since 2·4 = 8 ≡ 1 mod 7.
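A short search confirms that every nonzero class modulo 7 has a multiplicative inverse. This is only a sketch; `inverse_mod` and `inverses` are illustrative names, and the search by trial is chosen for transparency rather than efficiency.

```python
P = 7

def inverse_mod(a: int, p: int = P) -> int:
    """Search the nonzero classes for b with a*b leaving remainder 1 mod p."""
    for b in range(1, p):
        if (a * b) % p == 1:
            return b
    raise ValueError(f"{a} has no inverse modulo {p}")

# Table of inverses for the nonzero classes 1, ..., 6 modulo 7.
inverses = {a: inverse_mod(a) for a in range(1, P)}
```

Calling `inverse_mod(2, 6)` raises `ValueError`, in agreement with the fact that the integers modulo 6 have zero divisors and so do not form a field.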

The rationals form a totally ordered field. We feel, however, that
the rationals are not complete in the sense that we should like to deal
with numbers like √2, e, π, etc. For the irrationality of √2, see
Sec. 10.5. We shall have to extend the field of rationals. This will be
done in Sec. 10.6.

10.5. Some Theorems Concerning the Integers 

THEOREM 10.10. Let f, g be positive integers. There exist integers
h, r such that f = gh + r, 0 ≤ r < g. Our proof is by induction on f.
Let f = 1. Then

1 = 1·1 + 0 if g = 1, so that h = 1, r = 0
1 = g·0 + 1 if g > 1, so that h = 0, r = 1

Now assume Theorem 10.10 holds for f. Then

f = gh + r        0 ≤ r < g
and f + 1 = gh + (r + 1)

Since r < g, we have r + 1 < g or r + 1 = g. If r + 1 < g, the theorem
holds. If r + 1 = g, then f + 1 = g(h + 1) + 0 and Theorem 10.10
holds. From the induction postulate of Sec. 10.2, the theorem is true for
all integers f. We leave it as
an exercise to show that the h, r of Theorem 10.10 are unique.
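The induction in the proof of Theorem 10.10 translates directly into a loop: start from the case f = 1 and pass from f to f + 1 one step at a time. The sketch below follows that argument; the function name `divide` is illustrative, and the method is deliberately the slow inductive one rather than an efficient division routine.

```python
def divide(f: int, g: int) -> tuple[int, int]:
    """Return (h, r) with f = g*h + r and 0 <= r < g, for positive f, g."""
    if f < 1 or g < 1:
        raise ValueError("f and g must be positive integers")
    # Base case f = 1: (h, r) = (1, 0) if g = 1, otherwise (0, 1).
    h, r = (1, 0) if g == 1 else (0, 1)
    # Inductive step, repeated f - 1 times: pass from f to f + 1.
    for _ in range(f - 1):
        if r + 1 < g:
            r += 1            # f + 1 = g*h + (r + 1)
        else:
            h, r = h + 1, 0   # r + 1 = g, so f + 1 = g*(h + 1) + 0
    return h, r
```

The result can be compared with Python's built-in `divmod(f, g)`, which returns the same pair for positive arguments.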

THEOREM 10.11. We now derive a theorem regarding the greatest
common divisor of two integers a, b. Consider the collection of integers
(> 0) of the form ax + by > 0. Since every collection of positive integers
has a least integer, there exist integers x₀, y₀ such that

ax₀ + by₀ = d > 0

is the least integer of our collection. We now show that d is the greatest
common divisor. First assume that d does not divide a. Then
a = ds + r, 0 ≤ r < d, from Theorem 10.10. Thus

ax₀s + by₀s = ds = a − r

and (1 − x₀s)a + (−y₀s)b = r, a contradiction since r < d, unless r = 0,
in which case d divides a, written d/a. Similarly d/b. Now any other divisor,
d₁, of a and b must divide ax₀ + by₀ (why?), and hence d₁/d. Therefore
d is the greatest common divisor of a and b, written d = (a, b). As an
immediate corollary, if a and b are relatively prime, d = 1 and there
exist integers x, y such that ax + by = 1.
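The proof above is an existence argument and does not say how to find x₀, y₀. One standard way to compute them, which is not the method of the text, is the extended form of Euclid's algorithm. The sketch below assumes positive integers a, b; the name `bezout` is illustrative.

```python
def bezout(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with a*x + b*y = d, d the greatest common divisor."""
    # Invariants: old_r = a*old_x + b*old_y and r = a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y
```

For relatively prime a and b the first component is 1, and the other two components are integers x, y with ax + by = 1, as in the corollary.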

THEOREM 10.12. If (a, b) = 1 and a/bc, then a/c.

Proof. ax + by = 1 since (a, b) = 1, so that axc + byc = c. But
bc = ad since a/bc. Hence a(xc + yd) = c, which in turn implies that
a/c.

THEOREM 10.13. The Fundamental Theorem of Arithmetic. An integer 
N can be written uniquely as a product of primes. 

Proof. Assume 


N = p₁p₂ ··· pᵣ = q₁q₂ ··· qₛ


It is obvious that we can consider the pᵢ, qᵢ as primes since a nonprime
factor can be broken into further factors. Continuing this process
eventually reduces all factors of N into primes. Furthermore this can
be done only a finite number of times since all primes are greater than
or equal to 2. From above p₁/q₁q₂ ··· qₛ. If (p₁, q₁) ≠ 1, then p₁ = q₁.
Why? If (p₁, q₁) = 1, then p₁/q₂q₃ ··· qₛ from Theorem 10.12. We
continue this process until eventually p₁/qᵢ for an i = 1, 2, . . . , s.
Since qᵢ is assumed prime, we must have p₁ = qᵢ. We now cancel these
equal factors and continue our reasoning in the same manner. In this
way we obviously exhaust all the pᵢ and qᵢ. Thus, except for the order,
the number N is factored uniquely as a product of primes.
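The process of breaking a nonprime factor into further factors can be carried out by trial division. The following sketch is one simple way to do so and is not taken from the text; the name `prime_factors` is illustrative, and the argument is assumed to be an integer N ≥ 2.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n >= 2 in nondecreasing order."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    factors = []
    p = 2
    while p * p <= n:
        # Divide out p as often as possible; any p that divides here is prime,
        # since all smaller primes have already been removed from n.
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors
```

By the theorem, any other method of factoring must produce the same list apart from the order of the factors.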

THEOREM 10.14. √2 is irrational.

Proof. Assume √2 rational so that √2 = p/q. From Theorem
10.13 we have p = p₁p₂ ··· pᵣ, q = q₁q₂ ··· qₛ, with the pᵢ, qᵢ unique.
Thus upon squaring we have

N = 2q₁q₁q₂q₂ ··· qₛqₛ = p₁p₁p₂p₂ ··· pᵣpᵣ

Since the prime factor 2 must occur an odd number of times on the left
and can occur only an even number of times on the right, a contradiction
occurs. Q.E.D.

Problems 


1. Show that the collection of elements d(a, b) satisfies the Peano postulates as the
elements a, b range over the Peano integers.

2. Prove (10.5).

3. Prove (10.6).

4. Show that the set of elements R(1, b) has all the properties of the set of rational
integers b.

5. Show that the integers modulo 7 form a field.

6. Show that √p is irrational for p prime.

7. Show that √2 + √3 is irrational.

8. Consider the set of polynomials with rational coefficients. Deduce theorems for 
these polynomials analogous to the theorems of Sec. 10.5. 


10.6. An Extension of the Rationals. The Real-number System. 
The rationals obey the two following rules: 

1. They form a field. 

2. They are totally ordered. 

On the other hand, the rationals are not complete. Pythagoras realized
that there existed so-called numbers, √2, etc., which are not rational
numbers. If we wish to include √2 in our number system, we
must enlarge the field of rationals. R. Dedekind gave a fairly satisfactory
treatment of irrationals. We discuss an extension of the rationals
which differs little from Dedekind’s approach and which is due to
B. Russell.

DEFINITION 10.2. A Russell number is a set, R, of rationals having
the following properties.

1. R contains only rationals and is nonempty.

2. R does not contain all the rationals.

3. If α is a rational in R and if β is a rational less than α, then β is in R.

Two examples of Russell numbers are given as follows:

1. The set of all rationals < 1 is a Russell number. It will turn out
to be the unit of our constructed field of real numbers.

2. The negative rationals, zero, and all positive rationals whose squares are
less than 2 form a Russell set or number. We shall designate it by the
symbol √2.
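A Russell number is an infinite set, but it can be handled on a machine through its membership test. The sketch below encodes the two examples as predicates on exact rationals and checks property 3 of the definition on a finite grid of sample points; the names `in_one`, `in_sqrt2`, `downward_closed`, and `samples` are illustrative, and a check on finitely many points is no substitute for a proof.

```python
from fractions import Fraction

def in_one(q: Fraction) -> bool:
    """Membership in the Russell number 1: all rationals less than 1."""
    return q < 1

def in_sqrt2(q: Fraction) -> bool:
    """Membership in the Russell number sqrt(2): the negative rationals,
    zero, and the positive rationals whose squares are less than 2."""
    return q < 0 or q * q < 2

def downward_closed(member, points) -> bool:
    """Property 3 on a finite sample: if a is a member and b < a, so is b."""
    return all(
        member(b) for a in points if member(a) for b in points if b < a
    )

# Sample rationals -3.0, -2.9, ..., 2.9, 3.0 as exact fractions.
samples = [Fraction(k, 10) for k in range(-30, 31)]
```

For instance 7/5 belongs to the second set, since 49/25 < 2, while 3/2 does not, since 9/4 > 2.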

DEFINITION 10.3. Two Russell numbers R₁, R₂ are said to be equal or
equivalent if and only if every rational of R₁ is contained in R₂, and conversely,
with one possible exception.

The set of all rationals less than 2, which defines the Russell number,
two, is equivalent to the Russell number consisting of all rationals less
than or equal to 2. The only member of the second Russell number
not found in the first Russell number is the rational two.

DEFINITION 10.4. The Russell number R₁ is said to be greater than
the Russell number R₂ if, and only if, at least two rationals of R₁ are not
members of R₂.

We note that, given R₁ and R₂, only one of R₁ = R₂, R₁ > R₂, R₁ < R₂
can occur.

We must now define addition of two R numbers in such a way that the
result will yield an R number. Let R₁ and R₂ be two R numbers. Construct
the set of rationals obtained by adding in all possible ways the
rationals of R₁ with those of R₂. We leave it to the reader to show that
this new set of rationals defines an R number. What property will the
set of all rationals less than zero have? Is this Russell number needed
for a field? Does every R number have a negative? The student should
readily answer these questions.

It is a bit more difficult to define multiplication. We shall define
multiplication for two Russell numbers, R₁, R₂, which are greater than
the zero Russell number. We first delete the negatives of R₁ and R₂.
Then we multiply the positive rationals of R₁ with the positive rationals
of R₂ in all possible ways. To this acquired set we adjoin the negative
rationals. We leave it to the reader to show that the newly acquired
set is a Russell number. An equivalent definition would be the following:
Let C(R₁), the complement of R₁, be the set of rationals not in R₁. Show
that

C[C(R₁)·C(R₂)]

is a Russell number. By C(R₁)·C(R₂) we mean the complete set of
rationals obtained by the product of rationals of C(R₁) with rationals of
C(R₂) in all possible ways. We leave it to the reader to extend the
definition for the other cases of R₁ and R₂.

It is now an easy problem to show that the complete set of Russell
numbers forms a totally ordered field which contains the rationals as a
subfield. Everything hinges essentially on the fact that we reduce our 
computations to the field of rationals. Let the reader show that the set 
of Russell numbers forms a totally ordered field. 

The field of Russell numbers has an additional property not possessed 
by the rationals. To exhibit this property, we first define the supremum 
(least upper bound) of a set of numbers (Russell). 

DEFINITION 10.5. The number s is said to be the supremum of a set of
numbers S: (x, y, z, . . .) if and only if:

1. s ≥ x for all x in S.

2. If t < s, there exists an element y of S such that y > t.

We leave it to the reader to show that 1 is the supremum of the set of all
rationals less than 1. Also, √2 is the supremum of the set of all rationals whose
squares are less than 2. Let the reader show that a set of numbers cannot
have two distinct suprema. Let the reader also define the infimum
(greatest lower bound) of a set.

DEFINITION 10.6. A set, S, of Russell numbers is bounded above if
a Russell number, N, exists such that N > x for all x in S.

THEOREM 10.15. A supremum s exists for every set of numbers which
has an upper bound.




Our proof is as follows: Let S be a set of Russell numbers bounded 
above by N. We construct a new Russell number as follows: Let s con- 
sist of all rationals which belong to any Russell number of S. We first 
show that s is a Russell number. 

1. s obviously contains only rationals and is not the empty set. 

2. Since N contains rationals not in any of the Russell numbers of S, 
there are rationals not in s. 

3. Let α be a rational of s and β a rational < α. Since α belongs to a
Russell number, say, S₁ of S, then β also is in S₁ (why?), so that β belongs
to s from the definition of s.

Thus s is a Russell number (see Definition 10.2). We next demonstrate
that s is the supremum of the set S.

1. s ≥ all numbers of S, for otherwise there is at least one element,
S₁, of S such that S₁ > s. This implies that S₁ contains a rational not
in s, which is impossible from the construction of s.

2. Let r be a Russell number less than s, and let β, γ be elements of
s not in r. This is possible since s > r. Now β, γ belong to at least
one set, S₁, of S from the construction of s. It is easy to see that S₁ > r.

From Definition 10.5, s is the supremum of the set S.

The set of Russell numbers thus

(A)
1. Forms a field.
2. Is totally ordered.
3. Satisfies the property that a supremum exists for every set of
numbers bounded above.

Any set of elements satisfying the three properties of (A) is unique 
in the sense that they differ from the Russell numbers in name only. 
Thus the real-number system as set forth above is unique as regards its 
construction. 

To show this uniqueness, we let S and S̄ be two totally ordered fields
with the supremum property. We first match the zero elements and
the unit elements.

0 ↔ 0̄
1 ↔ 1̄

Then we match the rational integers,

2 ↔ 2̄
3 ↔ 3̄
. . .
n ↔ n̄

Then we match the rationals,

r_α ↔ r̄_α




Now consider any element, s, of S which has not as yet been put into
one-to-one correspondence with an element of S̄. We consider the set
of all rationals of S less than s. It is obvious that s is the supremum
of this set. The rationals of S̄ which correspond to the above-mentioned
set of rationals of S will be bounded above. Why? A supremum s̄
will exist for this set. We match s ↔ s̄. Thus every member of S can
be mapped into a number of S̄. We leave it to the reader to show that
every number of S̄ is exhausted (mapped completely) by this method.
Under the above mapping we realize that

a ↔ ā, b ↔ b̄

implies

a + b ↔ ā + b̄, ab ↔ āb̄

Such a correspondence, or mapping, is called an isomorphism, and it is in
this sense that we say the sets S and S̄ are equivalent.

We can now prove the Archimedean ordering postulate by use of the
supremum. Let r > 0, and consider the sequence of numbers

r, 2r, 3r, . . . , nr, . . .

We maintain that there is an integer m such that mr > 1. If this were
not so, the sequence constructed above would be bounded above and so a
supremum s would exist for the sequence. From the properties of s we
have s ≥ nr for all n. Now consider t = s − r < s. An element pr
exists such that pr > t = s − r. Thus pr + r > s or (p + 1)r > s, a
contradiction.

We list the Archimedean ordering postulates. Any one of the postulates
implies the existence of the others.

AO₁: r > 0 ⇒ ∃ integer n ∋ nr > 1. (∃ = there exists, ∋ = such
that.)

AO₂: α > β > 0 ⇒ ∃ rational r ∋ α > r > β.

AO₃: α > 0 ⇒ ∃ rational r ∋ α > r > 0.

AO₄: (Eudoxus) If for every rational r > β it follows that r > α,
then β ≥ α.

We now prove AO₂ from AO₁. If α > β > 0, then α − β > 0. From
AO₁ an integer n exists such that n(α − β) > 1 or nα > nβ + 1. Now
β > 0 so that an integer m exists such that m > βn. Let m be the smallest
of all such integers. Thus m > βn ≥ m − 1. Hence

nα > nβ + 1 ≥ m > nβ

so that

α > m/n > β
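The argument just given is constructive: it produces the rational m/n explicitly. The sketch below follows the same two steps, first finding n with n(α − β) > 1 and then the smallest integer m exceeding βn. The name `rational_between` is illustrative, and the arguments may be exact fractions or floating-point numbers; the latter are only approximate stand-ins for real numbers.

```python
from fractions import Fraction
import math

def rational_between(alpha, beta) -> Fraction:
    """Return a rational m/n with alpha > m/n > beta, given alpha > beta > 0."""
    if not alpha > beta > 0:
        raise ValueError("require alpha > beta > 0")
    # Step 1: the Archimedean postulate gives n with n*(alpha - beta) > 1.
    n = 1
    while n * (alpha - beta) <= 1:
        n += 1
    # Step 2: the smallest integer m with m > beta*n.
    m = math.floor(beta * n) + 1
    return Fraction(m, n)
```

With α = 1/2 and β = 1/3 the first step stops at n = 7 and the second gives m = 3, so the rational found is 3/7.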




Problems 


1. If b is a Russell number ≠ 0, how does one find the number b⁻¹ such that
bb⁻¹ = b⁻¹b = 1?

2. Extend the definition of multiplication of two Russell numbers for the cases for
which both numbers are not positive.

3. Show that the Russell numbers form a field.

4. Show from the definition of the supremum that it is unique for a set of elements.

5. Define the infimum of a set.

6. Prove AO₃ from AO₂, AO₄ from AO₃, AO₁ from AO₄.


10.7. Point-set Theory. For convenience, we shall consider only 
points or elements of the real-number system in what follows. Any set 
of real numbers will be called a linear set. The field of real numbers can 
be put into one-to-one correspondence with the points of a straight line in 
the usual fashion encountered in analytic geometry and the calculus. All 
the definitions and theorems here proved for linear sets can easily be 
extended to finite-dimensional spaces. 

DEFINITION 10.7. The set of points {x} satisfying a ≤ x ≤ b, a, b
finite, will be called a closed interval. If we omit the end points, that is,
consider those x which satisfy a < x < b, we say that the interval is
open (open at both ends). For example, 0 ≤ x ≤ 1 is a closed interval,
while 0 < x < 1 is an open interval.

DEFINITION 10.8. A linear set of points will be said to be bounded
if there exists an open interval containing the set. It must be emphasized
that the ends of the interval are to be finite numbers, thus excluding
−∞ and +∞.

An alternative definition would be the following: A set S of real numbers
is bounded if there exists a finite number N such that −N < x < N
for all x in S.

The set of numbers whose squares are less than 3 is certainly bounded,
for if x² < 3, then obviously −2 < x < 2. However, the set of numbers
whose cubes are less than 3 is unbounded, for x³ < 3 is at least satisfied
by all the negative numbers. This set, however, is bounded above. By
this we mean that there exists a finite number N such that x < N if
x³ < 3. Certainly N = 2 does the trick. Generally speaking, a set S
of elements is bounded above if a finite number N exists such that x < N
for all x in S. Let the reader frame a definition for sets bounded
below.

We shall, in the main, be concerned with sets which contain an infinite 
number of distinct points. The rational numbers in the interval 0 < x 
< 1 form such a collection. 

DEFINITION 10.9. Limit Point. A point p will be called a limit point 
of a set S if every open interval containing p contains an infinite number 
of distinct elements of S. 




For example, let S be the sequence of numbers 


1, 1/2, 1/3, . . . , 1/n, . . .


It is easy to verify that any open interval containing the origin O con- 
tains an infinite number of elements of S. Thus O is a limit point of 
the set S. Note that in this case the limit point O does not belong to S. 
It is also at once apparent that a set S containing only a finite number of 
points cannot have a limit point. 
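The behavior near a limit point can be observed numerically. The sketch below assumes that S is the sequence 1/n, n = 1, 2, 3, . . . , and counts how many of its first terms fall in a given open interval; the name `count_in_interval` is illustrative. Around the origin the count keeps growing as more terms are taken, while around a point such as 1/2 a small interval never captures more than one element.

```python
from fractions import Fraction

def count_in_interval(left: Fraction, right: Fraction, terms: int) -> int:
    """Count the elements 1/n, n = 1, ..., terms, lying in (left, right)."""
    return sum(
        1 for n in range(1, terms + 1) if left < Fraction(1, n) < right
    )

# An open interval about the origin, and one about the point 1/2.
near_zero = (Fraction(-1, 10), Fraction(1, 10))
near_half = (Fraction(2, 5), Fraction(3, 5))
```

A finite count can only suggest the conclusion; that every open interval containing the origin holds infinitely many elements of S still rests on the Archimedean postulate.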

Let the reader show that a point gq is not a limit point of a set S if 
at least one open interval containing g exists such that no points of S 
(except possibly g itself) are in this open interval. 

DEFINITION 10.10. Neighborhood. A neighborhood of a point is any
open interval containing that point. A deleted neighborhood Nₚ of p is
a set of points belonging to a neighborhood of p with the point p removed.
The set of points x such that a < x < b, x ≠ x₀, where a < x₀ < b, is a
deleted neighborhood of x = x₀.

The definition of a limit point can be reframed to read: p is a limit 
point of a set S if every deleted neighborhood of p contains at least one 
point of S. 

DEFINITION 10.11. Interior Point. A point p is said to be an interior
point of a set S if a neighborhood Nₚ of p exists such that every element
of Nₚ belongs to S. Does p belong to S? If S is the set of points 0 ≤ x
≤ 1, then 0 and 1 are not interior points since every neighborhood of
0 or 1 contains points that are not in S. All other points of this set, however,
are interior points.

DEFINITION 10.12. Boundary Point. A point p is a boundary point of
a set S if every neighborhood of p contains points in S and points not in S.
If S is the set 0 ≤ x ≤ 1, then 0 and 1 are the only boundary points. A
boundary point need not belong to the set. 1 is the only boundary point
of the set S which consists of points x such that x > 1.

DEFINITION 10.13. Exterior Point. A point p is an exterior point of
a set S if it is not an interior or boundary point of S.

DEFINITION 10.14. Complement of a Set. The complement of a set S
is the set of points not in S. The complement C(S) has a relative meaning,
for it depends on the set T in which S is embedded. If S, for example,
is the set of real numbers −1 ≤ x ≤ 1, then the complement of S relative
to the real-number system is the set of points |x| > 1. The complement
of −1 ≤ x ≤ 1 relative to the set −1 ≤ x ≤ 1 is the null set (no elements).
The complement of the set of rationals relative to the reals is
the set of irrationals, and conversely.

DEFINITION 10.15. Open Set. A set S is said to be open (not to be 
confused with open interval) if every point of S is an interior point of S. 




For example, the set of points {x} which satisfies either 0 < x < 1 or
6 < x < 8 is an open set.

DEFINITION 10.16. Closed Set. A set which contains all its limit
points is called a closed set. For example, the set

(0, 1/2, 1/3, 1/4, . . . , 1/n, . . .)

is closed since its only limit point is 0, which it contains.

DEFINITION 10.17. The set of all points (or elements) which belong
to S₁ or S₂ is called the union (S₁ ∪ S₂) or sum (S₁ + S₂) of the two sets
S₁, S₂. The shaded area below is S₁ + S₂ (see Fig. 10.1). The union,
S = ∪ S_α, of any number of sets, {S_α}, is the set of all points {x}, x
in at least one of the S_α.

DEFINITION 10.18. The set of all points which belong to both S₁
and S₂ simultaneously is called the intersection, or product, written
S₁ ∩ S₂ or S₁·S₂, of the two sets S₁, S₂. Graphically, the shaded
area is S₁ ∩ S₂.

[Fig. 10.1: shaded diagrams of S₁ ∪ S₂ and S₁ ∩ S₂]

The theory of sets and its application to logic and mathematics
was given great impetus by the English mathematician, George Boole
(1815-1864).

Two sets S₁, S₂ are said to be equal, S₁ = S₂, if every element of S₁
belongs to S₂, and conversely. If S₁ contains every element of S₂, we say
that S₂ is a subset of S₁, written S₁ ⊇ S₂. If S₂ is a subset of S₁ and if,
furthermore, at least one element of S₁ is not in S₂, we say that S₂ is a
proper subset of S₁, written

S₁ ⊃ S₂ or S₂ ⊂ S₁


Some obvious facts of set theory are:

(1) A + B = B + A
(2) (A + B) + C = A + (B + C)
(3) AB = BA
(4) (AB)C = A(BC)
(5) A + A = A, AA = A
(6) A ⊂ A + B, AB ⊂ A, AB ⊂ B
(7) A ⊂ C, B ⊂ C ⇒ A + B ⊂ C
(8) A ⊃ C, B ⊃ C ⇒ AB ⊃ C
(9) AB ⊂ A(B + C)
(10) A(B + C) = AB + AC

We proceed to prove (10). Let x be any element of A(B + C). Then
x belongs to A and to either B or C. Thus x belongs to either AB or AC
so that x is an element of AB + AC. Thus A(B + C) ⊂ AB + AC.
Conversely, AB ⊂ A(B + C), AC ⊂ A(B + C) from (9). From (7),
AB + AC ⊂ A(B + C), so that A(B + C) = AB + AC.
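The distributive law just proved can be confirmed by exhaustion for all subsets of a small set. In the sketch below Python's set operators `|` and `&` play the roles of the sum and product of sets; the universe {1, 2, 3, 4} and the names `subsets`, `distributive`, and `contained` are illustrative choices.

```python
from itertools import combinations

universe = {1, 2, 3, 4}

def subsets(s: set) -> list[set]:
    """All subsets of s, from the null set up to s itself."""
    items = sorted(s)
    return [
        set(c) for k in range(len(items) + 1) for c in combinations(items, k)
    ]

all_subsets = subsets(universe)

# Fact (10): A(B + C) = AB + AC for every choice of A, B, C.
distributive = all(
    a & (b | c) == (a & b) | (a & c)
    for a in all_subsets for b in all_subsets for c in all_subsets
)

# Fact (9): AB is contained in A(B + C).
contained = all(
    (a & b) <= (a & (b | c))
    for a in all_subsets for b in all_subsets for c in all_subsets
)
```

The same loop, with the expression changed, gives a quick check of any of the other listed facts before one attempts a proof.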


Problems 


1. What are the limit points of the set 0 ≤ x < 1? Is the set closed, open? What
are the boundary points?

2. The same as Prob. 1 with the point x = 1/2 removed.

3. Show that the set of all boundary points (the boundary of a set) of a set S is a
closed set.

4. Prove that the set of all limit points of a set S is a closed set. The set S̄ consisting
of S and its limit points is called the closure of S. Is S̄ a closed set?

5. Prove that the complement of a closed set is an open set, and conversely.

6. Why is a set S which contains only a finite number of elements a closed set?

7. Prove that the intersection of any number of closed sets is a closed set.

8. Prove that the union of any number of open sets is an open set.

9. A union of an infinite number of closed sets is not necessarily closed. Give an
example which verifies this.

10. An intersection of an infinite number of open sets is not necessarily open. Give
an example which verifies this.

11. Show that (A + B)(A + C) = A + BC.

12. The set of elements of A which are not in B is represented by A − B. Show
that

A(B − C) = AB − AC
(A − B) + (B − A) = (A + B) − AB
A + (B − A) = A + B
A(B − A) = 0 the null set

13. Show that, if A + X = S, AX = 0, then X = C(A). The null set (no
elements) is represented by 0.

14. If A ⊂ S, we call S − A = C(A) the complement of A relative to S. Show
that

A ⊂ B ⇒ C(B) ⊂ C(A)
C(A + B) = C(A)·C(B)
C(AB) = C(A) + C(B)


10.8. The Weierstrass-Bolzano Theorem. We are now in a position to 
determine a sufficient condition for the existence of a limit point. 

THEOREM 10.16. A limit point p exists for every infinite bounded 
linear set of points, S. 

The proof proceeds as follows: We construct a new set T. Into T we
place all points which are less than an infinite number of points of S.
T is not empty since S is bounded below. Moreover T is bounded above,
for S is bounded above. From Theorem 10.15 a supremum p exists for
the set T. We now show that p is a limit point of the set S. Consider
any neighborhood of p, say,

p − ε < x < p + δ        ε > 0        δ > 0

Since p − ε < p, there is a member t of T such that t > p − ε. Since t is
less than an infinite number of elements of S (why?), p − ε is less than
an infinite number of elements of S. Thus an infinite number of points of S lie to
the right of the point p − ε. On the other hand, the point p + δ has
only a finite number of points of S greater than it, for otherwise p + δ
would be a member of T, a contradiction, since p is the supremum of the
set T. Thus the open interval

p − ε < x < p + δ

contains an infinite number of elements of S. Since ε, δ are arbitrary,
every open interval containing p contains an infinite number of elements
of S. This proves the existence of a limit point p.


Problems 


1. Show that the limit point p of Theorem 10.16 is the supremum of all limit
points of S.

2. Prove the existence of a least or smallest limit point for an infinite bounded linear
set S.

3. Prove that a bounded monotonic increasing set of points has a unique limit.
Show that the sequence

uₙ = (1 + 1/n)ⁿ        n = 1, 2, 3, . . .

has a unique limit. We call it e.

4. Let S₁ and S₂ be two closed and bounded linear sets. Consider the totality of
distances obtained by finding the distance from any point s₁ in S₁ to any point s₂ in S₂.
Show that there exist two points s̄₁, s̄₂ of S₁ and S₂, respectively, such that |s̄₂ − s̄₁|
is the maximum distance between the two sets.

5. Let S be a closed and bounded set. Define the diameter of the set, and show
that two points s₁, s₂ of S exist such that |s₂ − s₁| is the maximum distance between
any two points of S.

6. Consider the infinite-dimensional linear vector space whose elements are of the
form a = (a₁, a₂, . . . , aₙ, . . .), the aᵢ real, such that $\sum_{i=1}^{\infty} a_i^2$ converges, written

$\sum_{i=1}^{\infty} a_i^2 = A < \infty$

If b = (b₁, b₂, . . . , bₙ, . . .) with $\sum_{i=1}^{\infty} b_i^2 = B < \infty$, then by definition

a + b = (a₁ + b₁, a₂ + b₂, . . . , aₙ + bₙ, . . .)

and xa = (xa₁, xa₂, . . . , xaₙ, . . .). We can define the distance ρ(a, b) between
two elements of this space (Hilbert) by the formula

$\rho(a, b) = \left[ \sum_{n=1}^{\infty} (b_n - a_n)^2 \right]^{1/2}$

Show that ρ(a, b) converges. Hint: Consider the sum

$\sum_{n=1}^{\infty} (\lambda a_n + b_n)^2 \ge 0$

or

$\lambda^2 \sum_{n=1}^{\infty} a_n^2 + 2\lambda \sum_{n=1}^{\infty} a_n b_n + \sum_{n=1}^{\infty} b_n^2 \ge 0$

and prove the Schwarz-Cauchy inequality,

$\left( \sum_{n=1}^{\infty} a_n b_n \right)^2 \le \left( \sum_{n=1}^{\infty} a_n^2 \right) \left( \sum_{n=1}^{\infty} b_n^2 \right)$

Also show that

(1) ρ(a, b) = 0 ⇔ a = b
(2) ρ(a, b) = ρ(b, a)
(3) ρ(a, b) + ρ(b, c) ≥ ρ(a, c)

Show that the infinite bounded set of points

e₁ = (1, 0, 0, . . . , 0, . . .)
e₂ = (0, 1, 0, . . . , 0, . . .)
. . .
eₙ = (0, 0, 0, . . . , 1, 0, . . .)

cannot have a limit point. By a spherical neighborhood of a point

P = (p₁, p₂, . . . , pₙ, . . .)

we mean the set of points

X = (x₁, x₂, . . . , xₙ, . . .)

which satisfy

$\sum_{n=1}^{\infty} (x_n - p_n)^2 < r^2$

7. Let L be a limit point of a set S. Show that one can pick out a sequence of elements
from S, say, s₁, s₂, s₃, . . . , sₙ, . . . , such that

lim sₙ = L as n → ∞

in the sense that every neighborhood of L contains an infinite number of the sₙ. Furthermore
sₙ₊₁ is closer to L than sₙ. We call such an approach a sequential approach
to L.




10.9. Theorem of Nested Sets. Let $M_1, M_2, \ldots, M_n, \ldots$ be a
sequence of closed and bounded sets such that $M_1 \supset M_2 \supset M_3 \supset \cdots \supset M_n \supset \cdots$.
We show that there is at least one point p which belongs
to all the sets $M_1, M_2, \ldots, M_n, \ldots$. First we choose an element $p_1$
from $M_1$, then $p_2$ from $M_2$, ..., $p_n$ from $M_n$, etc. The set of points
$(p_1, p_2, \ldots, p_n, \ldots)$ belongs to $M_1$. This set is bounded since $M_1$ is
bounded. From the Weierstrass-Bolzano theorem the set has at least
one limit point p which belongs to $M_1$, since $M_1$ is closed. Similarly p is
a limit point of the set $(p_2, p_3, \ldots, p_n, \ldots)$. These points belong to
$M_2$, so that $p \in M_2$. Finally p is a limit point of the set $(p_n, p_{n+1}, \ldots)$,
so that $p \in M_n$. Thus p belongs to each $M_i$, $i = 1, 2, \ldots$.

If the diameters of the sets $M_n$ tend to zero, then p is unique (see Prob. 5,
Sec. 10.8, for the definition of the diameter of a set).

Some mathematicians believe it necessary to postulate the existence of
the set $(p_1, p_2, \ldots, p_n, \ldots)$. More generally they desire the following
postulate: Let $S = \{S_\alpha\}$ be any collection of sets. Then there exists
a set P of elements $\{p_\alpha\}$ such that a point $p_\alpha$ of P exists for each set $S_\alpha$ of S, $p_\alpha$
belonging to the set $S_\alpha$. This is essentially the axiom of choice (Zermelo).

10.10. The Heine-Borel Theorem. Let S be any closed and bounded
set, and let T be any collection of open intervals having the property
that, if x is an element of S, then there exists an open interval $T_x$ of the
collection T such that x is an element of $T_x$. The Heine-Borel theorem
states that there exists a subcollection $T'$ of T which contains a finite
number of open intervals and such that every element x of S is contained
in one of the finite collection of open intervals that comprise $T'$.

Before proceeding to the proof we point out the following: 

1. Both the set S and the collection of open sets T are given beforehand,
since it is a simple matter to choose a single open interval which
completely covers a bounded set S.

2. S is to be closed, for consider the set $S = (1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots, 1/n, \ldots)$,
and let T consist of the following open intervals:
I 
Pa Be is Vee, 
| iho 
T» ti | r 2 <3 
Ej 1 
Es 5 ’ | 
; bi ni ~ (n+ 1)? 


e e ee © e© ee ee © ee e© 8 = 6 86 «© «© © « « 


It is easy to see that we cannot reduce the covering of S by eliminating
any of the given $T_i$, for there is no overlapping of these open intervals.
Each $T_i$ is required to cover the point $1/i$ of S.




Proof of the Theorem. Let S be contained in the interval $-N \le x \le N$.
This is possible since S is assumed bounded. Now divide this interval
into the two intervals (1) $-N \le x \le 0$, (2) $0 \le x \le N$. Any element
x of S lies in either (1) or (2). If the origin O is in S, then O lies in both
(1) and (2). The points of S in (1) form a closed set, as do the points of
S in (2). Why? Now if the theorem is false, it will not be possible to
cover the points of S in both (1) and (2) by a finite number of open
intervals which form a subset of T. Thus the points of S in either (1)
or (2), or possibly both, require an infinite number of covering sets of T.
Assume the elements of S in (1) still require an infinite covering; call
these points the set $S_1$. Do the same for $S_1$ by subdividing the interval
$-N \le x \le 0$ into two parts. Continue this process, repeating the
argument used above. In this way we construct a sequence of sets

$$S \supset S_1 \supset S_2 \supset \cdots \supset S_n \supset \cdots$$

such that each $S_n$ is a closed set (proof left to reader) and such that the
diameters of the $S_n \to 0$ as $n \to \infty$. From the theorem of nested sets
(Sec. 10.9) there exists a unique point p which is contained in every $S_n$,
$n = 1, 2, \ldots$, and $p \in S$. Since p is in S, an open interval $T_p$ exists
which covers p. This $T_p$ has a finite nonzero diameter, so that eventually
one of the $S_n$, say, $S_m$, will be contained in $T_p$. Why? But by
assumption all the elements of $S_m$ require an infinite number of the $\{T\}$
to cover them. This is a direct contradiction to the fact that a single
$T_p$ covers them. Hence our original assumption was wrong, and the
theorem is proved.

Note that our proof was by contradiction. Certain mathematicians
from the intuitional school of thought object to this type of proof. They
wish to have the finite subset of T, say, $T_1, T_2, \ldots, T_n$, exhibited,
such that every point p of S belongs to at least one of the $T_i$, $i = 1, 2, \ldots, n$.


Problem. Assuming the Heine-Borel theorem, show that the Weierstrass-Bolzano
theorem holds. Hint: If S has no limit points, it is a closed set. If p is a point of S
and is not a limit point, a neighborhood $N_p$ of p exists such that $N_p$ contains no points
of S other than p itself. Continue the proof.


We now show that, if we assume the Heine-Borel theorem true for any
linear point set, then the existence of the supremum can be established.
Let S be an infinite bounded set of real numbers. Why can we dispense
with the case for which S contains only a finite number of elements?
Since S is bounded above by an integer N, we can speak of the set T
consisting of all numbers x such that $x \ge$ every s of S, $x \le N$. The
set T is bounded. Is T closed? Yes, for if t is a limit point of T, and if
$t < s_0$, $s_0 \in S$, then we can find a neighborhood of t which excludes $s_0$, so that
the points of T in this neighborhood are less than $s_0$, a contradiction. Thus
$t \ge s_0$ for every $s_0 \in S$, so that $t \in T$, and T is closed. Every member of T satisfies the
first criterion for the existence of a supremum. Hence, if no member of
T is the actual supremum of the set S, the second criterion for the existence
of a supremum must be violated for every member of T. Thus
every point $t \in T$ is contained in a neighborhood $N_t$ such that no member
of S is in this neighborhood. These neighborhoods form a covering of T.
Assuming the Heine-Borel theorem, we can remove all but a finite number
of these neighborhoods and still cover the set T. Let the finite
collection of neighborhoods be designated by

$$N_i: \quad a_i < x < b_i \qquad i = 1, 2, \ldots, n$$

Let $a_1$ be the smallest of the $a_i$, $i = 1, 2, \ldots, n$. We assert that $a_1$
belongs to T. Let the reader verify this. Then the neighborhood to
which $a_1$ originally belonged has the property that no point of S is in
this neighborhood. This is a contradiction since those points to the left
of $a_1$ $(< a_1)$ are not covered by the neighborhoods $N_i$. Hence there must
be a supremum of the set S. Q.E.D.

10.11. Cardinal Numbers. We conclude the study of point sets with a 
brief discussion of the Cantor theory of countable and uncountable sets. 

DEFINITION 10.19. If two sets A and B are such that a correspondence
exists between their elements in such a way that for each x in A there
corresponds a unique y in B, and conversely, we say that the two sets
are in one-to-one correspondence and that they have the same cardinal
number.

Thus, without counting, a savage having 7 goats can make a fair trade 
with one having 7 wives. They need only pair each goat with each wife. 

DEFINITION 10.20. A set in one-to-one correspondence with the set
of integers (1, 2, 3, . . . , n) is said to have cardinal number n.

DEFINITION 10.21. A set which can be put into one-to-one correspond- 
ence with the Peano set of positive integers is said to be countable, or 
denumerable, or countably infinite, or denumerably infinite. 

We note that the set of integers $1^2, 2^2, 3^2, \ldots, n^2, \ldots$ is countably
infinite and at the same time is a subset of all the positive integers.
The cardinal number of the positive integers is called aleph zero ($\aleph_0$).

DEFINITION 10.22. An infinite collection which is not denumerable is 
said to be uncountable, or nondenumerable. 

THEOREM 10.17. A countable collection of countable sets is a countable
set. Let the sets be $S_1, S_2, \ldots, S_n, \ldots$. This can be arranged
since we have a countable collection of sets $\{S_n\}$. Since the elements of
$S_1$ are assumed countable, we may arrange them as follows:

          $S_1$:  $a_{11}, a_{12}, a_{13}, \ldots, a_{1n}, \ldots$
Similarly $S_2$:  $a_{21}, a_{22}, a_{23}, \ldots, a_{2n}, \ldots$
          $S_3$:  $a_{31}, a_{32}, a_{33}, \ldots, a_{3n}, \ldots$
          . . . . . . . . . . . . . . . . . . .
          $S_n$:  $a_{n1}, a_{n2}, a_{n3}, \ldots, a_{nn}, \ldots$

Now consider the collection of elements

$$a_{11}, a_{12}, a_{21}, a_{13}, a_{22}, a_{31}, \ldots, a_{1n}, a_{2(n-1)}, \ldots, a_{n1}, \ldots$$

This collection is countable since we have a first element, a second
element, etc. But this sequence exhausts every element of $\{S_n\}$. This
is done by a diagonalization process. Thus the proof of the theorem is
demonstrated.
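
The diagonal listing above can be made concrete with a short sketch. The function names `diagonal_pairs` and `position_of` are illustrative choices, not notation from the text; the sketch only assumes that each element is identified by its index pair (i, j).

```python
from itertools import count, islice

def diagonal_pairs():
    """Yield index pairs (i, j) in the order a11, a12, a21, a13, a22, a31, ..."""
    for total in count(2):              # i + j is constant along each diagonal
        for i in range(1, total):
            yield (i, total - i)

def position_of(i, j):
    """Return the 1-based place of a_ij in the diagonal listing."""
    total = i + j
    before = (total - 2) * (total - 1) // 2   # elements on earlier diagonals
    return before + i

# The first six index pairs of the listing.
first_six = list(islice(diagonal_pairs(), 6))
```

Because `position_of` returns a definite positive integer for every pair (i, j), each element of every set receives a place in the single sequence, which is the content of the theorem.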

We can now easily prove that the set of rationals on the interval
$0 \le x \le 1$ is countable. Into the set $S_n$ we place all fractions $m/n$,

$$0 \le m \le n \qquad n = 1, 2, 3, \ldots$$

There are a countable number of $S_n$, each containing a finite number of
fractions. The complete set of all such fractions yields the set of rationals
of $(0 \le x \le 1)$, this set being countable. Now let the reader show that
the set of rationals, $\{r\}$, for which $-\infty < r < \infty$ is a countable set.
Cantor showed that the set of real numbers in the interval $0 \le x \le 1$
is not a countable set. Assume the set countable. Then the elements of
$0 \le x \le 1$ can be arranged in a sequence

$$s_1 = 0.a_{11} a_{12} a_{13} \cdots a_{1n} \cdots$$
$$s_2 = 0.a_{21} a_{22} a_{23} \cdots a_{2n} \cdots$$
$$\cdots\cdots\cdots\cdots\cdots\cdots\cdots$$
$$s_n = 0.a_{n1} a_{n2} a_{n3} \cdots a_{nn} \cdots$$
$$\cdots\cdots\cdots\cdots\cdots\cdots\cdots$$

where the $a_{ij}$ are the integers zero through 9.
The $s_i$, $i = 1, 2, 3, \ldots, n, \ldots$, are written in decimal notation.
Now consider the number $s = 0.b_1 b_2 \cdots b_n \cdots$, where

$$b_1 = a_{11} \pm 2 \qquad b_2 = a_{22} \pm 2 \qquad \ldots \qquad b_n = a_{nn} \pm 2 \qquad \ldots$$

the $-2$ occurring if $a_{ii} = 8$ or 9. The number s is certainly not among
any of the $s_i$, $i = 1, 2, 3, \ldots, n, \ldots$, since it differs from every one of
these $s_i$. But the collection $\{s_i\}$ was supposed to exhaust the real
numbers on the interval $0 \le x \le 1$. Thus a contradiction occurs, and
the real numbers are not countable.
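
A finite illustration of the digit rule may help. The sketch below assumes each listed number is given as a string of decimal digits (the digits after "0."); the function name `diagonal_number` and the sample rows are hypothetical, and a finite list can only illustrate the construction, not the theorem itself.

```python
def diagonal_number(rows):
    """Build digits b_1 b_2 ... with b_n = a_nn + 2, or a_nn - 2 when a_nn is 8 or 9."""
    digits = []
    for n, row in enumerate(rows):
        a = int(row[n])                     # the diagonal digit a_nn
        b = a - 2 if a >= 8 else a + 2      # stays within 0..9 and never equals a
        digits.append(str(b))
    return "".join(digits)

# Four sample expansions (digits after the decimal point), chosen arbitrarily.
rows = ["1415", "9999", "0080", "3337"]
s = diagonal_number(rows)
```

By construction the n-th digit of `s` differs from the n-th digit of the n-th row, so `s` cannot coincide digit for digit with any row of the list.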




Problems 


1. Show that the set of algebraic numbers, those numbers which satisfy polynomial 
equations with integral coefficients, is a countable set. 

2. Why are the irrationals uncountable? 

3. Consider the Cantor set obtained as follows: From the interval $0 \le x \le 1$ delete
the middle third, leaving the end points $\tfrac{1}{3}$, $\tfrac{2}{3}$. Repeat this process for the remaining
intervals, always keeping the end points. The set that remains after a countable
number of such operations is the Cantor set. Since each removed set is open, show
that the Cantor set is a closed set. If the numbers $0 \le x \le 1$ are written as decimal
expansions to the base 3, show that the Cantor set can be put into one-to-one correspondence
with the set of reals on $0 \le x \le 1$ when these numbers are written as
decimal expansions to the base 2. Show that every point of the Cantor set is a limit
point of the set.

4. Show that a limit point exists for a nondenumerable collection S of linear points.
The set need not be bounded. Hint: Consider the intervals $N \le x < N + 1$ for the
rational integers N. At least one of these intervals must contain a nondenumerable
number of points of S. Why?

5. Show that a limit point exists and belongs to a nondenumerable collection S of
linear points.


10.12. Functions of a Real Variable. Limits and Continuity. Let G
be any linear set of real numbers. G may be an open or closed set, a finite
set of points, the continuum, a closed interval, etc. We write $G = \{x\}$,
where x is any element of G. Assume that to each x of G there corresponds
a unique real number y, $x \to y$, and let $H = \{y\}$ be the totality of
real numbers obtained through the correspondence. This mapping of
the set G into the set H defines a real-valued function, usually written

$$y = f(x) \qquad x \text{ in } G$$

We say that y is a function of x in the sense that for each x there exists
a rule for determining the corresponding y. The notation f(x) simply
means that f operates on x according to some predetermined rule. For
example, if $y = f(x) = x^2$, $-3 \le x \le 5$, the rule for determining y given
any x of the set $-3 \le x \le 5$ simply is to square the given number x.
Consider Table 10.1. The set G consists of the numbers x = -1, 2, 5, 14,

TABLE 10.1

    x |  -1 |  2 |  5 | 14
    y |   3 |  8 | -2 |  0

and the set H consists of the corresponding values y = 3, 8, -2, 0,
respectively. The table defines y = f(x), x in G. The reader is familiar
with the use of cartesian coordinates for obtaining a visual picture of
y = f(x).




An important concept in the study of functions is the limit process.
One reason why the student of elementary calculus encounters difficulty
in understanding the theory of limits is simply that the concept of a
limit rarely occurs in physical reality. The statement $\lim_{x \to x_0} x^2 = x_0^2$ is
quite confusing to beginners. To the novice this statement appears to
be trivially true, for is not $x^2 = x_0^2$ when $x = x_0$? It must be understood
that the statement $x \to x_0$ is an abstract entity invented by man. We
visualize a variable x which is forced to approach the constant $x_0$ in such
a manner that the difference between x and $x_0$ tends to zero. This leads
us to the following definition:

DEFINITION 10.23. An infinitesimal is a variable which approaches
zero as a limit. $\eta$ is said to be an infinitesimal if $|\eta|$ becomes and remains
less than any preassigned $\epsilon > 0$.

If $\eta_1$ and $\eta_2$ are infinitesimals, then eventually $|\eta_1| < \epsilon/2$, $|\eta_2| < \epsilon/2$,
for arbitrary $\epsilon > 0$, so that $|\eta_1 + \eta_2| \le |\eta_1| + |\eta_2| < \epsilon$, and $\eta_1 + \eta_2$ is also
an infinitesimal. Let the reader show that the difference and product
of two infinitesimals are again infinitesimals and that the product of a
bounded function with an infinitesimal is again an infinitesimal. Let x
be a variable which in succession takes on the values $1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots, 1/n, \ldots$.
Certainly x is an infinitesimal since $1/n \to 0$ as n becomes
infinite. For any value of x the sum $S = x + x + x + \cdots$ is infinite.


On the other hand each term of S tends to zero. Hence an infinite sum 
of infinitesimals need not be an infinitesimal. For another example, let x 
be an infinitesimal, so that x?/(1 + 2x?)" is also an infinitesimal for n = 0, 
12,4 <%4 Now 


S = eee cea 
eee 
n=O 


so that S is also an infinitesimal. An infinite sum of infinitesimals may 
or may not be an infinitesimal. Let the reader show by specific examples 
that the quotient of two infinitesimals may or may not be an infinitesimal. 

Let us return to the statement $\lim_{x \to x_0} x^2 = x_0^2$. To show that this
statement holds, we must prove that $x^2 - x_0^2$ is an infinitesimal whenever
$x - x_0$ is an infinitesimal. Now $|x^2 - x_0^2| = |x - x_0| \cdot |x + x_0|$, and
$|x + x_0|$ is bounded for x near $x_0$. Since $|x - x_0|$ is an infinitesimal (by
our choice), of necessity $x^2 - x_0^2$ is an infinitesimal, so that $x^2 - x_0^2 \to 0$,
or $x^2 \to x_0^2$, whenever $x \to x_0$.
Let the reader prove the following results: If $\lim_{x \to a} f(x) = M$, $\lim_{x \to a} g(x) = N$, then

(1) $\lim_{x \to a} [f(x) + g(x)] = M + N$

(2) $\lim_{x \to a} [f(x)g(x)] = MN$

(3) $\lim_{x \to a} \left[\dfrac{f(x)}{g(x)}\right] = \dfrac{M}{N} \qquad N \ne 0$

A knowledge of limits enables us to introduce the concept of continuity
of a function. We consider the functional relationship y = f(x), and
we let $x = x_0$ be a point of G with $y_0 = f(x_0)$. What shall we mean when
we say that f(x) is continuous at $x = x_0$? Intuitively we feel that, if x is
any point of G which is close to $x_0$, then f(x) will be continuous at $x = x_0$
if the corresponding value y will be close to $y_0$. Actually we desire
$y - y_0$ to be an infinitesimal if $x - x_0$ is an infinitesimal.

DEFINITION 10.24. f(x) is said to be continuous at $x = x_0$ if for
any $\epsilon > 0$ there exists a $\delta(\epsilon) > 0$ such that

$$|f(x) - f(x_0)| < \epsilon$$

whenever $|x - x_0| < \delta$, x in G.

Let us look at the graph of Fig. 10.2. For any $\epsilon > 0$ we construct
the horizontal lines $y = y_0 + \epsilon$, $y = y_0 - \epsilon$. If we can draw two distinct
vertical lines $x = x_0 + \delta$, $x = x_0 - \delta$ such that, for every point x of G
in the open interval $x_0 - \delta < x < x_0 + \delta$, f(x) has a numerical value lying
between $f(x_0) - \epsilon$, $f(x_0) + \epsilon$, we say that f(x) is continuous at $x = x_0$.
The distinct vertical lines must exist for every $\epsilon > 0$. Note that the
maximum size of $\delta$ will depend on $\epsilon$. The smaller $\epsilon$ is, the smaller $\delta$ will
have to be. Moreover the point $x_0$ will influence the value of $\delta$. The
steeper the curve, the smaller $\delta$ becomes.
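
The dependence of the admissible width on both the tolerance and the point can be explored numerically. The sketch below uses f(x) = x^2 and one admissible choice, min(1, eps / (2|x0| + 1)), which follows from |x^2 - x0^2| = |x - x0| |x + x0|; this particular formula and the names `delta_for_square` and `check` are assumptions made for illustration, not the only valid choice.

```python
def delta_for_square(eps, x0):
    """One admissible delta for f(x) = x*x at x0: min(1, eps / (2|x0| + 1))."""
    return min(1.0, eps / (2.0 * abs(x0) + 1.0))

def check(eps, x0, samples=1000):
    """Sample points strictly inside (x0 - delta, x0 + delta) and test |f(x) - f(x0)| < eps."""
    d = delta_for_square(eps, x0)
    for k in range(1, samples):
        x = x0 - d + 2.0 * d * k / samples
        if abs(x * x - x0 * x0) >= eps:
            return False
    return True

ok_near_three = check(0.1, 3.0)
ok_far_out = check(1e-3, -50.0)
```

Sampling is only a sanity check, not a proof; the proof is the inequality |x + x0| < 2|x0| + 1 when |x - x0| < 1. Note that the returned width shrinks as |x0| grows, in line with the remark that the steeper the curve, the smaller the width must be.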

Two equivalent definitions of continuity are as follows: f(x) is continuous
at $x = x_0$ if:

(a) $\lim_{x \to x_0} f(x) = f(x_0)$, x in G.

(b) For any $\epsilon > 0$ there exists a $\delta$ neighborhood of $x_0$ such that $|f(x_1) - f(x_2)| < \epsilon$
whenever $x_1$ and $x_2$ are in G and belong to the $\delta$ neighborhood
of $x_0$.

Functions are discontinuous for two reasons:

1. $\lim_{x \to x_0} f(x)$ does not exist.

2. $\lim_{x \to x_0} f(x)$ exists but is not equal to $f(x_0)$.

Fig. 10.2


Example 10.1. Consider f(x) = x + 1, $x \ne 2$, f(2) = 5. We have

$$\lim_{x \to 2} (x + 1) = 3$$

However $f(2) = 5 \ne 3$, so that f(x) is discontinuous at x = 2. If we were to redefine
the value of f(x) at x = 2 to be 3 instead of 5, we would remove the discontinuity. In
this example f(x) has a removable type of discontinuity.


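Example 10.2. Consider $f(x) = \dfrac{1}{1 + 2^{1/x}}$, $x \ne 0$, f(0) = 1. We have

$$\lim_{\substack{x \to 0 \\ x > 0}} \frac{1}{1 + 2^{1/x}} = 0 \qquad \lim_{\substack{x \to 0 \\ x < 0}} \frac{1}{1 + 2^{1/x}} = 1$$

Hence $\lim_{x \to 0} f(x)$ does not exist for all manner of approach of x to zero. It is obvious
that f(x) does not have a removable type of discontinuity at x = 0.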
Example 10.3. f(x) = sin (1/x), $x \ne 0$, f(0) = 2. Since f(x) oscillates between
-1 and +1 as $x \to 0$, $\lim_{x \to 0} f(x)$ does not exist and f(x) is discontinuous at x = 0.


Example 10.4. f(x) = 1/x, $x \ne 0$, f(0) = 1. Some authors consider that 1/x
is discontinuous at x = 0 since division by zero is undefined. In this example
f(x) is discontinuous at x = 0 since $\lim_{x \to 0} 1/x$ does not exist. On the other hand
$f(x) = (x^2 - 1)/(x - 1)$, $x \ne 1$, f(1) = 2 is continuous at x = 1 even though
$(x^2 - 1)/(x - 1)$ is undefined at x = 1.

Example 10.5. If $f(x) = x^2$ for $0 \le x \le 1$, we note that f(x) is continuous everywhere
on this interval. Why is f(x) continuous at x = 0 and x = 1? If x = c is a
point of this interval, we have $|x^2 - c^2| = |x - c| \cdot |x + c| < 3|x - c|$ so that
$|x^2 - c^2| < \epsilon$ whenever $|x - c| < \epsilon/3 = \delta$. We have found a value of $\delta$ independent
of the point x = c. We say, therefore, that $f(x) = x^2$ is uniformly continuous on the
interval $0 \le x \le 1$.

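Example 10.6. Let f(x) = 1/x for $a \le x \le 1$, a > 0. It is easily seen that f(x) is
continuous for every x of this range. For x = c we have

$$\left|\frac{1}{x} - \frac{1}{c}\right| = \frac{|x - c|}{xc} \le \frac{1}{a^2}\,|x - c|$$

Thus $\left|\dfrac{1}{x} - \dfrac{1}{c}\right| < \epsilon$ if $\dfrac{1}{a^2}|x - c| < \epsilon$ or $|x - c| < a^2\epsilon = \delta$. Since $\delta = a^2\epsilon$ is independent
of x = c, we have uniform continuity. On the other hand f(x) is also continuous
at every point of the interval $0 < x \le 1$. But f(x) is not uniformly continuous on this
range since $\delta \to 0$ as $a \to 0$. For any $\epsilon > 0$ we cannot find a $\delta > 0$ such that $\left|\dfrac{1}{x} - \dfrac{1}{c}\right| < \epsilon$
whenever $|x - c| < \delta$ for all c on the range $0 < x \le 1$. Note that the interval
$0 < x \le 1$ is not a closed set.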
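
The contrast between the two ranges can be seen numerically. The sketch below is an illustration under stated assumptions: it samples a grid of points c, pairs each with a point x lying just inside distance delta, and records the largest gap |1/x - 1/c|; the names `gap`, `delta_on_closed_range`, and `worst_gap` are hypothetical.

```python
def gap(x, c):
    """The quantity |1/x - 1/c| studied in Example 10.6."""
    return abs(1.0 / x - 1.0 / c)

def delta_on_closed_range(eps, a):
    """The choice delta = a*a*eps that works on a <= x <= 1."""
    return a * a * eps

def worst_gap(a, delta, steps=2000):
    """Largest gap over grid points c in [a, 1] paired with x = c + 0.999*delta (capped at 1)."""
    worst = 0.0
    for k in range(steps + 1):
        c = a + (1.0 - a) * k / steps
        x = min(1.0, c + 0.999 * delta)
        worst = max(worst, gap(x, c))
    return worst

eps = 0.5
delta = delta_on_closed_range(eps, 0.1)    # delta chosen for the range 0.1 <= x <= 1
within_tolerance = worst_gap(0.1, delta)   # stays below eps on that range
beyond_tolerance = worst_gap(0.001, delta) # same delta, range pushed toward 0
```

With the same delta, moving the left end of the range toward zero makes the largest gap exceed the tolerance, which is the failure of uniform continuity on $0 < x \le 1$ described above.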


DEFINITION 10.25. Let f(x) be defined over the range G. f(x) is
said to be uniformly continuous on G if for every $\epsilon > 0$ there exists a
$\delta = \delta(\epsilon)$ such that

$$|f(x_1) - f(x_2)| < \epsilon$$

whenever $|x_1 - x_2| < \delta$, $x_1$ and $x_2$ in G.

THEOREM 10.18. If G is a closed and bounded set and if f(x) is continuous
at every point of G, then f(x) is uniformly continuous on G.




Proof. We base the proof of this theorem on the Weierstrass-Bolzano
theorem. First let us assume that f(x) is not uniformly continuous on
the closed and bounded set G. This means that there exists at least one
$\epsilon = \epsilon_0$ such that for any $\delta > 0$ there will exist $x_1$, $\bar{x}_1$ such that
$|f(x_1) - f(\bar{x}_1)| > \epsilon_0$ with $|x_1 - \bar{x}_1| < \delta$. We take $\delta = 1$ and obtain the pair of
numbers $x_1$, $\bar{x}_1$. Then we take $\delta = \tfrac{1}{2}$ and obtain the pair $(x_2, \bar{x}_2)$. For
$\delta = 1/n$ we have the pair $(x_n, \bar{x}_n)$ such that $|f(x_n) - f(\bar{x}_n)| > \epsilon_0$ with
$|x_n - \bar{x}_n| < 1/n = \delta$. By this process we generate the two sequences
$(x_1, x_2, \ldots, x_n, \ldots)$, $(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n, \ldots)$. Since G is a closed
and bounded set, the sequence $(x_1, x_2, \ldots, x_n, \ldots)$ has a limit point
x = c belonging to G. We pick out a subsequence which converges
sequentially to x = c. The corresponding points from the sequence $\{\bar{x}_n\}$
have the same limit, x = c, since $|x_n - \bar{x}_n| < 1/n$. Let these two sequences
be designated by $\{x'_n\}$, $\{\bar{x}'_n\}$. Since f(x) is continuous at x = c, we have
$|f(x'_n) - f(c)| < \epsilon_0/2$, $|f(\bar{x}'_n) - f(c)| < \epsilon_0/2$, for n sufficiently large. Thus
$|f(x'_n) - f(\bar{x}'_n)| < \epsilon_0$, a contradiction, since for all pairs $x'_n$, $\bar{x}'_n$,
$|f(x'_n) - f(\bar{x}'_n)| > \epsilon_0$. Q.E.D.

THEOREM 10.19. If f(x) and g(x) are continuous at x = a, then
f(x) + g(x), f(x)g(x), f(x)/g(x) with $g(a) \ne 0$, are continuous at x = a.
The proof is left to the reader.

THEOREM 10.20. If f(x) is continuous over a closed and bounded set
G, then f(x) is bounded.

Proof. Assume f(x) is not bounded. There exists an $x_1$ such that
$f(x_1) > 1$, an $x_2$ such that $f(x_2) > 2$, ..., an $x_n$ such that $f(x_n) > n$,
.... The set $(x_1, x_2, \ldots, x_n, \ldots)$ has a limit point $\bar{x}$ belonging
to G since G is a closed and bounded set. There exists a subsequence
$(x'_1, x'_2, \ldots, x'_n, \ldots)$ which converges sequentially to $\bar{x}$. Thus from
continuity

$$\lim_{x'_n \to \bar{x}} f(x'_n) = f(\bar{x})$$

But $f(x'_n) \to \infty$ as $x'_n \to \bar{x}$, a contradiction, since f(x) has a specific real
value at $x = \bar{x}$.

THEOREM 10.21. If f(x) is continuous on a closed and bounded set
G, then a point $x_0$ of G exists such that $f(x_0) \ge f(x)$ for all x of G.

Proof. From Theorem 10.20 f(x) is bounded above as x ranges over G.
Thus the set of values $\{f(x)\}$ has a supremum, say, S. Thus $S \ge f(x)$
for all x of G. Choose $x_1$ such that $f(x_1) > S - 1$, $x_2$ such that
$f(x_2) > S - \tfrac{1}{2}$, ..., $x_n$ such that $f(x_n) > S - 1/n$, .... Why is this possible?
The set $(x_1, x_2, \ldots, x_n, \ldots)$ has a limit point $\bar{x}$ belonging to G. We
pick out a subsequence $\{x'_n\}$ which converges sequentially to $\bar{x}$, with
$x'_n = x_m$. Now $f(x_m) > S - 1/m$, so that

$$\lim_{m \to \infty} f(x_m) \ge \lim_{m \to \infty} \left(S - \frac{1}{m}\right) = S$$

From continuity, $\lim f(x'_n) = \lim f(x_m) = f(\bar{x})$, so that $f(\bar{x}) \ge S$. But
$S \ge f(\bar{x})$, so that $f(\bar{x}) = S$. Q.E.D.

THEOREM 10.22. Let f(x) be continuous on the range $a \le x \le b$. If
f(a) < 0, f(b) > 0, there exists a point c such that f(c) = 0 with a < c < b.

Proof. Let T be the set of points of $a \le x \le b$ such that f(x) < 0, and
if $f(x_1) < 0$, $x_2 < x_1$, then $f(x_2) < 0$. We speak only of x on the range
$a \le x \le b$. T is not empty since x = a belongs to T. T is bounded
above since x = b does not belong to T. Hence T has a supremum;
call it x = c. Now f(c) = 0, or f(c) > 0, or f(c) < 0. Assume f(c) > 0.
Since f(x) is continuous at x = c, we can find a $\delta$ such that

$$|f(x) - f(c)| < \frac{f(c)}{2}$$

whenever $|x - c| < \delta$, $\delta > 0$. Thus f(x) > f(c)/2 > 0 whenever $|x - c| < \delta$.
But f(x) < 0 for x < c, a contradiction, so that $f(c) \le 0$. Let the
reader show that f(c) < 0 cannot occur, so that f(c) = 0.
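
The theorem guarantees that a zero exists; locating it numerically is a separate matter. The sketch below uses repeated halving of the interval (bisection), which is a different technique from the supremum argument of the proof and is offered only as an illustration. The name `bisect_root`, the tolerance, and the sample function x^3 - 2 are assumptions.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Shrink [a, b], keeping f(a) < 0 <= f(b), until the interval is shorter than tol."""
    if not (f(a) < 0.0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    while b - a > tol:
        mid = 0.5 * (a + b)
        if f(mid) < 0.0:
            a = mid          # the sign change lies in the right half
        else:
            b = mid          # the sign change lies in the left half
    return 0.5 * (a + b)

# f(x) = x^3 - 2 is continuous with f(0) < 0 < f(2), so a zero lies between 0 and 2.
root = bisect_root(lambda x: x ** 3 - 2.0, 0.0, 2.0)
```

Each halving preserves a sign change across the interval, so continuity places a zero inside every interval produced, and the midpoints close in on it.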





Problems 


1. For the following functions find a $\delta = \delta(\epsilon)$ such that $|f(x_1) - f(x_2)| < \epsilon$ whenever
$|x_1 - x_2| < \delta$:

(a) $f(x) = 2x^3$ for $-1 \le x \le 1$
(b) $f(x) = \sin x$ for $0 \le x \le 2\pi$
(c) $f(x) = \sqrt{1 - x^2}$ for $-1 \le x \le 1$


2. Show that f(x) = x sin (1/x), $x \ne 0$, f(0) = 0, is continuous at x = 0.

3. Show that y = sin (1/x) is not uniformly continuous on the range 0 < x < 1.

4. Find an example of a function f(x) which is continuous on a bounded set, but f(x)
is not bounded.

5. Find an example of a function f(x) which is continuous on a closed set, but f(x)
is not bounded.

6. Prove Theorem 10.19.

7. Discuss the continuity of f(x) = 0 when x is rational, f(x) = x when x is irrational,
f(0) = 0.

8. Let f(x) = 0 when x is irrational, f(x) = 1/q when x = p/q, (p, q) = 1. Show
that f(x) is discontinuous at the rational points and is continuous at the irrational
points. Hint: Show that there are only a finite number of rational numbers p/q with
(p, q) = 1 such that $1/q > \epsilon$ for any $\epsilon > 0$.

9. Prove Theorem 10.18 by making use of the Heine-Borel theorem. 

10. Prove Theorem 10.20 by making use of Theorem 10.18. 

11. Let f(x) = g(x) for x rational. If f(x) and g(x) are continuous for all x, show
that f(x) = g(x) for all x.

12. Let f(x) and g(x) be continuous on the range $a \le x \le b$. Let h(x) = f(x) if
$f(x) \ge g(x)$, h(x) = g(x) if $g(x) \ge f(x)$. Show that

$$h(x) = \tfrac{1}{2}(f(x) + g(x)) + \tfrac{1}{2}|f(x) - g(x)|$$

Also show that h(x) is continuous for $a \le x \le b$.

13. If f(x) is defined only for a finite set of values of x, show that f(x) is continuous
at these points.




14. If f(x) is continuous at x = c with f(c) > 0, show that a $\delta > 0$ exists such that,
for $|x - c| < \delta$, f(x) > 0, x in G.

15. If f(x) is continuous on a closed and bounded set G, show that a point $x_0$ of G
exists such that $f(x_0) \le f(x)$ for all x in G.


10.13. The Derivative. Let f(x) be defined on a range G, and let c be
a limit point of G. We consider any sequence of G which converges to c,
say, $\{x_n\}$. Next we compute

$$\lim_{\substack{x_n \to c \\ x_n \text{ in } G}} \frac{f(x_n) - f(c)}{x_n - c} \qquad (10.9)$$

If this limit exists and is independent of the particular sequence which
converges to c, we say that f(x) is differentiable at x = c. We represent
this limit or first derivative by $f'(c)$.

The definition above is equivalent to the statement that for any $\epsilon > 0$
there exists a $\delta > 0$ and a number, $f'(c)$, such that

$$\left|\frac{f(x) - f(c)}{x - c} - f'(c)\right| < \epsilon$$

whenever $0 < |x - c| < \delta$, x in G.

It follows almost immediately that if $f'(c)$ exists then f(x) is continuous at
x = c, for

$$|f(x) - f(c) - f'(c)(x - c)| \to 0$$

as $x \to c$ implies $f(x) \to f(c)$ as $x \to c$.
The reader is well aware of the geometric interpretation of the first
derivative (see Fig. 10.3). In Fig. 10.3,

$$PR = \Delta x \qquad RQ = f(x + \Delta x) - f(x)$$

the slope of the secant line L joining P to Q is $\dfrac{f(x + \Delta x) - f(x)}{\Delta x}$, the slope
of the tangent line T is defined as $f'(x)$, and $RS = \dfrac{df}{dx}\,\Delta x$ is called the
differential of f(x).


Example 10.7. Let $f(x) = x^2 \sin (1/x)$, $x \ne 0$, f(0) = 0. To see whether f(x) is
differentiable at x = 0, we compute

$$\lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0} \frac{x^2 \sin (1/x)}{x} = \lim_{x \to 0} x \sin \frac{1}{x} = 0$$

Hence $f'(0) = 0$. Let the reader show that $f'(x)$ is not continuous at x = 0 by showing
that $\lim_{x \to 0} f'(x)$ does not exist.
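
A numerical look at this example is sketched below. It assumes the formula f'(x) = 2x sin(1/x) - cos(1/x) for x not equal to 0, which follows from the product and chain rules; the function names are illustrative.

```python
import math

def f(x):
    """f(x) = x^2 sin(1/x) for x != 0, and f(0) = 0."""
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0

def difference_quotient_at_zero(h):
    """[f(h) - f(0)] / h, which equals h*sin(1/h) and is bounded by |h|."""
    return (f(h) - f(0.0)) / h

def f_prime(x):
    """Derivative away from the origin: 2x sin(1/x) - cos(1/x), valid for x != 0."""
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

# Two points close to 0: one where cos(1/x) = 1 and one where cos(1/x) = -1.
near_minus_one = f_prime(1.0 / (2000.0 * math.pi))
near_plus_one = f_prime(1.0 / (2001.0 * math.pi))
```

The difference quotient at the origin shrinks with h, while the derivative keeps swinging between values near -1 and +1 at points arbitrarily close to 0, which is why the limit of f'(x) as x tends to 0 cannot exist.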




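The reader can verify that, if f(x) and g(x) are differentiable at x = c, then

$$\frac{d}{dx}\,[f(x) + g(x)]_{x=c} = f'(c) + g'(c)$$

$$\frac{d}{dx}\,[f(x)g(x)]_{x=c} = f(c)g'(c) + g(c)f'(c) \qquad (10.10)$$

$$\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right]_{x=c} = \frac{g(c)f'(c) - f(c)g'(c)}{[g(c)]^2} \qquad g(c) \ne 0$$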


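The following differentiation rules are known to hold:

$$\frac{d\,(\text{constant})}{dx} = 0 \qquad \frac{d \cos x}{dx} = -\sin x$$

$$\frac{dx^n}{dx} = nx^{n-1} \qquad \frac{de^x}{dx} = e^x$$

$$\frac{d \sin x}{dx} = \cos x \qquad \frac{d \ln x}{dx} = \frac{1}{x}$$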
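
These rules can be spot-checked numerically with a symmetric difference quotient. The sketch below is only a sanity check at one sample point; the step size, the sample point x = 0.7, the constant 7, and the exponent n = 3 are arbitrary assumptions.

```python
import math

def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient [f(x + h) - f(x - h)] / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Each entry pairs a function with the derivative claimed by the table of rules.
rules = [
    ("constant", lambda x: 7.0, lambda x: 0.0),
    ("x^n with n = 3", lambda x: x ** 3, lambda x: 3.0 * x ** 2),
    ("sin x", math.sin, math.cos),
    ("cos x", math.cos, lambda x: -math.sin(x)),
    ("e^x", math.exp, math.exp),
    ("ln x", math.log, lambda x: 1.0 / x),
]

def max_rule_error(x=0.7):
    """Largest discrepancy between the difference quotient and the stated derivative."""
    return max(abs(central_difference(f, x) - df(x)) for _, f, df in rules)
```

A small discrepancy supports the table at the sample point but proves nothing; the rules themselves follow from the limit definition (10.9).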


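Example 10.8. Implicit Differentiation. If y = f(u) and $u = \varphi(x)$, we say that y
depends implicitly on x. Let $u_0 = \varphi(x_0)$, and assume that $\dfrac{df}{du}$ exists at $u = u_0$, $\dfrac{d\varphi}{dx}$
exists at $x = x_0$. We show that

$$\frac{dy}{dx} = \frac{df}{du}\frac{d\varphi}{dx} = \frac{dy}{du}\frac{du}{dx}$$

at $x = x_0$, $u = u_0$. From

$$\Delta y = f(u_0 + \Delta u) - f(u_0)$$
$$\Delta u = \varphi(x_0 + \Delta x) - \varphi(x_0)$$

we have

$$\frac{\Delta y}{\Delta x} = \frac{f(u_0 + \Delta u) - f(u_0)}{\Delta u} \cdot \frac{\varphi(x_0 + \Delta x) - \varphi(x_0)}{\Delta x} \qquad \Delta u \ne 0,\ \Delta x \ne 0 \qquad (10.11)$$

If, as $\Delta x \to 0$, $\Delta u$ does not vanish infinitely often, then eventually $\Delta u \ne 0$, and

$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \left.\frac{dy}{du}\right|_{u=u_0} \left.\frac{du}{dx}\right|_{x=x_0}$$

since the limit of a product is the product of the respective limits, provided they exist.
On the other hand, we cannot apply (10.11) if $\Delta u = 0$ infinitely often as $\Delta x \to 0$. For
this case, however, $\dfrac{du}{dx} = \lim_{\Delta x \to 0} \dfrac{\Delta u}{\Delta x} = 0$. But for those values of $\Delta x$ for which $\Delta u = 0$
we have $\dfrac{\Delta y}{\Delta x} = \dfrac{f(u_0) - f(u_0)}{\Delta x} = 0$, so that $\dfrac{dy}{dx} = 0$. Hence $\dfrac{dy}{dx} = \dfrac{dy}{du}\dfrac{du}{dx}$ still holds
since both sides of this equation vanish.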
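Example 10.9. The Rule of Leibnitz. The derivative of a product u(x)v(x) is

$$\frac{d}{dx}(uv) = u\frac{dv}{dx} + v\frac{du}{dx}$$

If we differentiate again, we obtain $\dfrac{d^2}{dx^2}(uv) = u\dfrac{d^2v}{dx^2} + 2\dfrac{du}{dx}\dfrac{dv}{dx} + v\dfrac{d^2u}{dx^2}$. Let the
reader prove by mathematical induction or otherwise that

$$\frac{d^n}{dx^n}(uv) = u\frac{d^nv}{dx^n} + \binom{n}{1}\frac{du}{dx}\frac{d^{n-1}v}{dx^{n-1}} + \binom{n}{2}\frac{d^2u}{dx^2}\frac{d^{n-2}v}{dx^{n-2}} + \cdots = \sum_{k=0}^{n} \binom{n}{k}\frac{d^ku}{dx^k}\frac{d^{n-k}v}{dx^{n-k}} \qquad (10.12)$$

with $\dfrac{d^0u}{dx^0} = u$, $\dfrac{d^0v}{dx^0} = v$. As an example of Leibnitz's rule we consider the Legendre
polynomials $P_n(x)$, defined as $P_n(x) = \dfrac{1}{2^n n!}\dfrac{d^n}{dx^n}(x^2 - 1)^n$, $P_0(x) = 1$, $n = 0, 1, 2, \ldots$.
We have

$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2 - 1)^n = \frac{1}{2^n n!}\frac{d^{n-1}}{dx^{n-1}}\left[2nx(x^2 - 1)^{n-1}\right]$$

or

$$P_n(x) = \frac{1}{2^n n!}\left[2nx\frac{d^{n-1}}{dx^{n-1}}(x^2 - 1)^{n-1} + 2n(n - 1)\frac{d^{n-2}}{dx^{n-2}}(x^2 - 1)^{n-1}\right]$$

$$= xP_{n-1}(x) + \frac{n - 1}{2^{n-1}(n - 1)!}\frac{d^{n-2}}{dx^{n-2}}(x^2 - 1)^{n-1}$$

so that

$$\frac{dP_n(x)}{dx} = P_{n-1}(x) + x\frac{dP_{n-1}(x)}{dx} + (n - 1)P_{n-1}(x) = x\frac{dP_{n-1}(x)}{dx} + nP_{n-1}(x) \qquad (10.13)$$

Moreover

$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}\left[(x^2 - 1)(x^2 - 1)^{n-1}\right]$$

$$= \frac{1}{2^n n!}\left[(x^2 - 1)\frac{d^n}{dx^n}(x^2 - 1)^{n-1} + 2nx\frac{d^{n-1}}{dx^{n-1}}(x^2 - 1)^{n-1} + n(n - 1)\frac{d^{n-2}}{dx^{n-2}}(x^2 - 1)^{n-1}\right]$$

$$= \frac{x^2 - 1}{2n}\frac{dP_{n-1}(x)}{dx} + xP_{n-1}(x) + \frac{n - 1}{2^n(n - 1)!}\frac{d^{n-2}}{dx^{n-2}}(x^2 - 1)^{n-1}$$

so that

$$\frac{dP_n(x)}{dx} = \frac{1}{2n}\frac{d}{dx}\left[(x^2 - 1)\frac{dP_{n-1}(x)}{dx}\right] + P_{n-1}(x) + x\frac{dP_{n-1}(x)}{dx} + \frac{n - 1}{2}P_{n-1}(x) \qquad (10.14)$$

Combining (10.13) with (10.14) yields

$$\frac{d}{dx}\left[(1 - x^2)\frac{dP_{n-1}(x)}{dx}\right] + n(n - 1)P_{n-1}(x) = 0$$

or

$$\frac{d}{dx}\left[(1 - x^2)\frac{dP_n(x)}{dx}\right] + n(n + 1)P_n(x) = 0 \qquad (10.15)$$

Equation (10.15) is Legendre's differential equation (see Sec. 5.13).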
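
Relations (10.13) and (10.15) can be checked exactly for small n with polynomial arithmetic on coefficient lists. The sketch below builds $P_n$ directly from the defining formula; the helper names and the list representation (coefficients from the constant term upward) are illustrative assumptions, and a check for finitely many n is not a proof.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a coefficient list; a constant differentiates to [0]."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def poly_add(p, q):
    """Sum of two coefficient lists of possibly different lengths."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(c, p):
    return [c * a for a in p]

def is_zero(p):
    return all(c == 0 for c in p)

def legendre(n):
    """P_n(x) = (1 / (2^n n!)) d^n/dx^n (x^2 - 1)^n."""
    p = [Fraction(1)]
    for _ in range(n):
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])  # multiply by x^2 - 1
    for _ in range(n):
        p = poly_diff(p)
    return poly_scale(Fraction(1, 2 ** n * factorial(n)), p)

def recurrence_residual(n):
    """P_n' - x P_{n-1}' - n P_{n-1}; relation (10.13) says this vanishes."""
    x = [Fraction(0), Fraction(1)]
    rhs = poly_add(poly_mul(x, poly_diff(legendre(n - 1))),
                   poly_scale(Fraction(n), legendre(n - 1)))
    return poly_add(poly_diff(legendre(n)), poly_scale(Fraction(-1), rhs))

def legendre_ode_residual(n):
    """d/dx[(1 - x^2) P_n'] + n(n + 1) P_n; equation (10.15) says this vanishes."""
    one_minus_x2 = [Fraction(1), Fraction(0), Fraction(-1)]
    first = poly_diff(poly_mul(one_minus_x2, poly_diff(legendre(n))))
    return poly_add(first, poly_scale(Fraction(n * (n + 1)), legendre(n)))
```

Exact rational arithmetic is used so that a vanishing residual means every coefficient is exactly zero rather than merely small.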


From Theorem 10.21 we readily prove Theorem 10.23 due to Rolle. 

THEOREM 10.23. Rolle's Theorem. Let f(x) be continuous in the
interval $a \le x \le b$, and let f(x) possess a derivative at every point of
this interval. If f(a) = f(b) = 0, there exists a point x = c, a < c < b,
such that $f'(c) = 0$.




Proof. From Theorem 10.21 there exists at least one point x = c such
that $f(c) \ge f(x)$ for $a \le x \le b$. If f(x) > 0 somewhere, c lies inside the
interval; if $f(x) \le 0$ throughout, the same argument applies to the point
at which f(x) is least. Now

$$f'(c) = \lim_{\substack{h \to 0 \\ h > 0}} \frac{f(c + h) - f(c)}{h} \le 0$$

since $f(c + h) \le f(c)$. Moreover

$$f'(c) = \lim_{\substack{h \to 0 \\ h < 0}} \frac{f(c + h) - f(c)}{h} \ge 0$$

Of necessity, $f'(c) = 0$. Q.E.D. The condition of the theorem can
be weakened by assuming that $f'(c)$ exist at x = c, with f(c) a local
supremum or infimum of f(x). Let the reader give a geometric interpretation
of Rolle's theorem, and let him construct an example of a continuous
function, f(x), f(a) = f(b) = 0, such that nowhere is $f'(x) = 0$,
$a \le x \le b$.

Law of the Mean. Let f(x) and $\varphi(x)$ be differentiable functions on
the interval $a \le x \le b$. We show that a point x = c exists such that

$$[\varphi(b) - \varphi(a)]f'(c) = [f(b) - f(a)]\varphi'(c) \qquad a < c < b \qquad (10.16)$$

The function

$$\psi(x) = \varphi(x)[f(b) - f(a)] - f(x)[\varphi(b) - \varphi(a)] - \varphi(a)f(b) + \varphi(b)f(a)$$

vanishes for x = a and x = b. $\psi(x)$ was obtained by finding A, B, C
such that $\psi(x) = A\varphi(x) + Bf(x) + C$ vanishes for x = a, x = b. Applying
Rolle's theorem to $\psi(x)$ yields the theorem of the mean. As a special
case, choose $\varphi(x) = x$, so that (10.16) becomes

$$f(b) - f(a) = f'(c)(b - a) \qquad a < c < b \qquad (10.17)$$

Let the reader give a geometric interpretation of (10.17).
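L'Hospital's Rule. From (10.16) we have

$$\frac{f(b) - f(a)}{\varphi(b) - \varphi(a)} = \frac{f'(c)}{\varphi'(c)} \qquad a < c < b \qquad (10.18)$$

provided $\varphi(b) - \varphi(a) \ne 0$, $\varphi'(c) \ne 0$. Let us assume that f(a) = 0,
$\varphi(a) = 0$, so that

$$\frac{f(b)}{\varphi(b)} = \frac{f'(c)}{\varphi'(c)} \qquad a < c < b$$

and

$$\lim_{b \to a} \frac{f(b)}{\varphi(b)} = \lim_{b \to a} \frac{f'(c)}{\varphi'(c)} = \lim_{c \to a} \frac{f'(c)}{\varphi'(c)} \qquad (10.19)$$

since a < c < b. We can rewrite (10.19) as

$$\lim_{x \to a} \frac{f(x)}{\varphi(x)} = \lim_{x \to a} \frac{f'(x)}{\varphi'(x)} \qquad (10.20)$$

which is L'Hospital's rule. We cannot apply (10.20) unless f(a) = 0,
$\varphi(a) = 0$. What can be done if $f'(a) = 0$, $\varphi'(a) = 0$? If $f'(x)$ and $\varphi'(x)$
are continuous at x = a, and if $\varphi'(a) \ne 0$, then (10.20) can be written

$$\lim_{x \to a} \frac{f(x)}{\varphi(x)} = \frac{f'(a)}{\varphi'(a)} \qquad (10.21)$$

provided f(a) = 0, $\varphi(a) = 0$.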
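Example 10.10

$$\lim_{x \to 0} \frac{\sin x - x + x^3/6}{x^5} = \lim_{x \to 0} \frac{\cos x - 1 + x^2/2}{5x^4}$$

Now $g(x) = \cos x - 1 + x^2/2$ and $h(x) = 5x^4$ both vanish at x = 0 and are differentiable.
Reapplying (10.20) yields

$$\lim_{x \to 0} \frac{\sin x - x + x^3/6}{x^5} = \lim_{x \to 0} \frac{-\sin x + x}{20x^3} = \lim_{x \to 0} \frac{-\cos x + 1}{60x^2} = \lim_{x \to 0} \frac{\sin x}{120x} = \lim_{x \to 0} \frac{\cos x}{120} = \frac{1}{120}$$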
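
The limiting value can be compared with direct evaluation at a few small arguments. This is a sketch; the sample points are arbitrary assumptions, and they are kept moderately small because the numerator involves heavy cancellation in floating-point arithmetic when x is very close to zero.

```python
import math

def ratio(x):
    """The quotient (sin x - x + x^3/6) / x^5 from Example 10.10."""
    return (math.sin(x) - x + x ** 3 / 6.0) / x ** 5

# Evaluate at a few points approaching 0 from the right.
values = [(x, ratio(x)) for x in (0.5, 0.2, 0.1)]
limit_value = 1.0 / 120.0
```

The computed quotients move toward 1/120 as x decreases, in agreement with the repeated application of (10.20).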


We complete this section with a discussion of the second derivative.
Let f(x) be differentiable in a neighborhood of x = c. The second derivative
of f(x) at x = c is defined as

$$f''(c) = \lim_{h \to 0} \frac{f'(c + h) - f'(c)}{h}$$

provided this limit exists. Now $f'(x) = \lim_{k \to 0} \dfrac{f(x + k) - f(x)}{k}$, so that

$$f'(c) = \lim_{k \to 0} \frac{f(c + k) - f(c)}{k}, \qquad f'(c + h) = \lim_{k \to 0} \frac{f(c + h + k) - f(c + h)}{k}$$

and

$$f''(c) = \lim_{h \to 0} \lim_{k \to 0} \frac{f(c + h + k) - f(c + h) - f(c + k) + f(c)}{hk}$$

The order in which we take these limits is important. It can be shown,
however, that under certain conditions we can replace k by h so that only
one limit process is necessary. We would thus have

$$f''(c) = \lim_{h \to 0} \frac{f(c + 2h) - 2f(c + h) + f(c)}{h^2} \tag{10.22}$$

We now investigate under what conditions (10.22) yields $f''(c)$. Let
U(c) = f(c + h) - f(c), U(c + h) = f(c + 2h) - f(c + h), so that

$$U(c + h) - U(c) = f(c + 2h) - 2f(c + h) + f(c)$$


Applying the law of the mean as given by (10.17) yields

$$hU'(\xi) = h[f'(\xi + h) - f'(\xi)] = f(c + 2h) - 2f(c + h) + f(c)$$

with $c < \xi < c + h$. Applying (10.17) once more yields

$$h^2 f''(\bar{\xi}) = f(c + 2h) - 2f(c + h) + f(c) \tag{10.23}$$

with $\xi < \bar{\xi} < \xi + h$, so that $c < \bar{\xi} < c + 2h$. To obtain (10.23), we assumed that
$f''(x)$ exists in a neighborhood of x = c. If, moreover, $f''(x)$ is continuous at x = c,
we note that (10.22) results.
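As a rough numerical illustration of (10.22), the following sketch evaluates the quotient [f(c + 2h) - 2f(c + h) + f(c)]/h^2 for decreasing h. The test function sin x, the point c = 1, and the name `second_difference` are choices made here for illustration only.

```python
import math

def second_difference(f, c, h):
    """Quotient [f(c + 2h) - 2 f(c + h) + f(c)] / h^2 appearing in (10.22)."""
    return (f(c + 2 * h) - 2 * f(c + h) + f(c)) / h**2

c = 1.0
# For f = sin the second derivative is -sin(c); the quotient should approach it.
for h in (1e-1, 1e-2, 1e-3):
    print(h, second_difference(math.sin, c, h), -math.sin(c))
```

For a quadratic such as f(x) = x^2 the quotient equals 2 for every h, since f'' is constant.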





Problems 
1. Derive the results of (10.10).
2. Show that if $y = x^{m/n}$, m and n integers, then $\dfrac{dy}{dx} = \dfrac{m}{n}\,x^{(m/n)-1}$ by assuming
$\dfrac{d\,x^n}{dx} = n x^{n-1}$ for any positive integer n.
eee 2 
S:chow thatim 2 
20 x 24 
4. Let $y(x) = \tan^{-1} x$. From $(1 + x^2)\dfrac{d^2y}{dx^2} + 2x\dfrac{dy}{dx} = 0$ find $\dfrac{d^n y}{dx^n}$ at x = 0 for
n = 2, 3, 4, . . . , by the use of Leibnitz's rule.
5. Show that $\dfrac{d^n}{dx^n}\,(e^{ax}f(x)) = e^{ax}(D + a)^n f(x)$ with $D = \dfrac{d}{dx}$, $D^2 = \dfrac{d^2}{dx^2}$, . . . ,

$$(D + a)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} D^k$$


6. Find an expression for D*[f(z) cos Az]. 

7. Prove Leibnitz’s rule by mathematical induction. 

8. If $f'(x) = 0$ for $a \le x \le b$, show that f(x) = constant for $a \le x \le b$.

9. Show that (10.17) can be written as $f(b) - f(a) = (b - a)f'(a + \theta(b - a))$,
$0 < \theta < 1$.


10.14. Functions of Two or More Variables. We say that f(x, y) is
continuous at $x = x_0$, $y = y_0$ if for any $\epsilon > 0$ there exists a $\delta > 0$ such
that

$$|f(x, y) - f(x_0, y_0)| < \epsilon$$

whenever $|x - x_0|^2 + |y - y_0|^2 < \delta$. In general $f(x_1, x_2, \ldots, x_n)$ is
continuous at $(x_1^0, x_2^0, \ldots, x_n^0)$ if for any $\epsilon > 0$ there exists a $\delta > 0$
such that

$$|f(x_1, x_2, \ldots, x_n) - f(x_1^0, x_2^0, \ldots, x_n^0)| < \epsilon$$

whenever $\sum_{i=1}^{n} |x_i - x_i^0|^2 < \delta$.
Theorems 10.20 and 10.21 can be shown to be true for a continuous
function of n variables, n finite.




An example of a function which is continuous in x and is continuous in
y but is not continuous in both x and y is as follows: Define

$$f(x, y) = \frac{xy}{x^2 + y^2}, \qquad x^2 + y^2 \neq 0$$
$$f(0, 0) = 0$$

Then $\lim_{x \to 0} f(x, 0) = \lim_{x \to 0} 0 = 0 = f(0, 0)$, so that f(x, y) is continuous in x
at (0, 0). The same statement applies to the variable y. However,

$$\lim_{\substack{x \to 0 \\ y \to 0}} f(x, y) \neq f(0, 0)$$

since $\lim_{x \to 0,\, y \to 0} f(x, y)$ does not exist. This can be seen by letting y = mx,
$m \neq 0$, and noting that

$$\lim_{\substack{x \to 0 \\ y = mx}} f(x, y) = \lim_{x \to 0} \frac{mx^2}{x^2 + m^2x^2} = \frac{m}{1 + m^2}$$

Since m is arbitrary, $\lim_{x \to 0,\, y \to 0} f(x, y)$ does not exist.
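The dependence of the limit on the direction of approach can be seen numerically. The sketch below assumes that the function of this example is f(x, y) = xy/(x^2 + y^2) with f(0, 0) = 0, which is what the quotient m/(1 + m^2) suggests; the helper name `along_line` is introduced here only for illustration.

```python
def f(x, y):
    """Assumed example function: xy / (x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def along_line(m, x):
    """Value of f on the line y = m x."""
    return f(x, m * x)

# Approaching the origin along different lines gives different constants.
for m in (0.5, 1.0, 2.0):
    print(m, along_line(m, 1e-6), m / (1 + m * m))
```

Along each axis the function vanishes identically, while along y = mx it stays at m/(1 + m^2), so no single limit exists at the origin.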


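One defines the partial derivatives of f(x, y) at the point $(x_0, y_0)$ as
follows:

$$f_1(x_0, y_0) = \left.\frac{\partial f}{\partial x}\right|_{x = x_0,\ y = y_0} = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x}$$
$$f_2(x_0, y_0) = \left.\frac{\partial f}{\partial y}\right|_{x = x_0,\ y = y_0} = \lim_{\Delta y \to 0} \frac{f(x_0, y_0 + \Delta y) - f(x_0, y_0)}{\Delta y} \tag{10.24}$$

provided these limits exist. Further derivatives can be computed, and
we write

$$f_{11} = \frac{\partial^2 f}{\partial x^2} \qquad f_{12} = \frac{\partial^2 f}{\partial x\,\partial y} \qquad f_{21} = \frac{\partial^2 f}{\partial y\,\partial x} \qquad f_{22} = \frac{\partial^2 f}{\partial y^2}$$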








Example 10.11. Consider
f(z, y} = ae x* + y? * 0 




















f(0, 0) =0 
Then fi(0, 0) = 2 i tim FOO”) ” lim 5 =0 
Oa) = a oe tim LO) = y) —~f0,y) _ lim a == 0 
fa(0, 0) = 5 7-| = = lim 1S. OS OO Tim f= 0 
f2(0, 0) = a tim h) — f(O, 0) = lim 5 =0 
a 5 z0 iy = . lim Ee 
fis(0, 0) = 5% |, = Tim 2Exs O20, 0) = lim y= 







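In this example $\left.\dfrac{\partial^2 f}{\partial y\,\partial x}\right|_{0,0} \neq \left.\dfrac{\partial^2 f}{\partial x\,\partial y}\right|_{0,0}$. We can show, however, that, if $f_{12}$ and $f_{21}$
are continuous, then $f_{12} = f_{21}$.

Proof. Define

$$U = f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y) - f(x + \Delta x, y) + f(x, y)$$
$$\Phi(y) = f(x + \Delta x, y) - f(x, y)$$
$$\Phi(y + \Delta y) = f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y)$$

so that $U = \Phi(y + \Delta y) - \Phi(y)$.

If we apply the law of the mean to U, we have

$$U = \Delta y\, \frac{\partial \Phi(y + \theta_1 \Delta y)}{\partial y} = \Delta y \left[\frac{\partial f(x + \Delta x, y + \theta_1 \Delta y)}{\partial y} - \frac{\partial f(x, y + \theta_1 \Delta y)}{\partial y}\right], \qquad 0 < \theta_1 < 1$$

Again applying the law of the mean to the variable x ($y + \theta_1 \Delta y$ can be considered as a
constant as far as x is concerned) yields

$$U = \Delta y\, \Delta x\, \frac{\partial^2 f(x + \theta_2 \Delta x, y + \theta_1 \Delta y)}{\partial x\, \partial y}, \qquad 0 < \theta_1 < 1, \quad 0 < \theta_2 < 1$$

By interchanging the role of x and y one easily shows that

$$U = \Delta x\, \Delta y\, \frac{\partial^2 f(x + \theta_3 \Delta x, y + \theta_4 \Delta y)}{\partial y\, \partial x}, \qquad 0 < \theta_3 < 1, \quad 0 < \theta_4 < 1$$

Equating these two values of U, dividing by $\Delta y \cdot \Delta x$, and finally allowing $\Delta x \to 0$,
$\Delta y \to 0$, we obtain

$$\frac{\partial^2 f(x, y)}{\partial y\, \partial x} = \frac{\partial^2 f(x, y)}{\partial x\, \partial y}$$

provided these partial derivatives are continuous.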
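The quantity U of the proof, divided by the product of the two increments, gives a direct numerical estimate of the mixed partial derivative. The sketch below is a hedged illustration: the smooth test function sin(x) e^{2y} and the name `mixed_quotient` are assumptions chosen here, not part of the text.

```python
import math

def f(x, y):
    # Smooth test function (an assumption for illustration only).
    return math.sin(x) * math.exp(2.0 * y)

def mixed_quotient(f, x, y, dx, dy):
    """U / (dx * dy), with U the second mixed difference used in the proof."""
    U = f(x + dx, y + dy) - f(x, y + dy) - f(x + dx, y) + f(x, y)
    return U / (dx * dy)

x, y = 0.3, 0.1
exact = 2.0 * math.cos(x) * math.exp(2.0 * y)  # f_12 = f_21 for this f
print(mixed_quotient(f, x, y, 1e-4, 1e-4), exact)
```

Because U is symmetric in the roles of x and y, the same quotient approximates both mixed derivatives, which is the content of the theorem when they are continuous.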
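Example 10.12. Total Differentials. Let z = f(x, y) have continuous first partial
derivatives. Now

$$z + \Delta z = f(x + \Delta x, y + \Delta y)$$
$$\Delta z = f(x + \Delta x, y + \Delta y) - f(x, y) = [f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y)] + [f(x, y + \Delta y) - f(x, y)]$$

If we apply the law of the mean to both terms above, we obtain

$$\Delta z = \frac{\partial f(x + \theta_1 \Delta x, y + \Delta y)}{\partial x}\,\Delta x + \frac{\partial f(x, y + \theta_2 \Delta y)}{\partial y}\,\Delta y$$

with $0 < \theta_1 < 1$, $0 < \theta_2 < 1$. Under the assumption of continuity of the first partial
derivatives we have

$$\frac{\partial f(x + \theta_1 \Delta x, y + \Delta y)}{\partial x} = \frac{\partial f(x, y)}{\partial x} + \epsilon_1$$
$$\frac{\partial f(x, y + \theta_2 \Delta y)}{\partial y} = \frac{\partial f(x, y)}{\partial y} + \epsilon_2$$

where $\epsilon_1 \to 0$, $\epsilon_2 \to 0$ as $\Delta x \to 0$, $\Delta y \to 0$. Hence $\Delta z$ becomes

$$\Delta z = \frac{\partial f(x, y)}{\partial x}\,\Delta x + \frac{\partial f(x, y)}{\partial y}\,\Delta y + \epsilon_1\,\Delta x + \epsilon_2\,\Delta y \tag{10.25}$$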


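The principal part of $\Delta z$ is defined to be

$$dz = \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y$$

If z = f(x, y) = x, then dz = dx, $\dfrac{\partial f}{\partial x} = 1$, $\dfrac{\partial f}{\partial y} = 0$, and $\Delta x = dx$. Similarly $\Delta y = dy$, so that

$$dz = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy \tag{10.26}$$

where dz is called the total differential of z. If x and y are functions of t, then from
(10.25)

$$\frac{\Delta z}{\Delta t} = \frac{\partial f}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y}\frac{\Delta y}{\Delta t} + \epsilon_1 \frac{\Delta x}{\Delta t} + \epsilon_2 \frac{\Delta y}{\Delta t}$$

If $\dfrac{dx}{dt}$ and $\dfrac{dy}{dt}$ exist, we obtain

$$\frac{dz}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \tag{10.27}$$

since $\epsilon_1 \to 0$, $\epsilon_2 \to 0$ as $\Delta t \to 0$. Remember that if $\Delta t \to 0$ of necessity $\Delta x \to 0$,
$\Delta y \to 0$.

In the general case, if $u = f(x^1, x^2, \ldots, x^n)$, the total differential of u is defined as

$$du = \sum_{\alpha=1}^{n} \frac{\partial f}{\partial x^\alpha}\,dx^\alpha \tag{10.28}$$

The operator form of (10.26) is

$$d[\ ] = dx\,\frac{\partial [\ ]}{\partial x} + dy\,\frac{\partial [\ ]}{\partial y} \tag{10.29}$$

where the bracket can represent any differentiable function of x and y. In particular,

$$d\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right)dx + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)dy = \frac{\partial^2 f}{\partial x^2}\,dx + \frac{\partial^2 f}{\partial y\,\partial x}\,dy$$
$$d\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)dx + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right)dy = \frac{\partial^2 f}{\partial x\,\partial y}\,dx + \frac{\partial^2 f}{\partial y^2}\,dy$$

and

$$d^2z = d(dz) = \frac{\partial^2 f}{\partial x^2}\,dx^2 + 2\,\frac{\partial^2 f}{\partial x\,\partial y}\,dx\,dy + \frac{\partial^2 f}{\partial y^2}\,dy^2 \tag{10.30}$$

provided $\dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial^2 f}{\partial y\,\partial x}$. Symbolically, we can write (10.30) as

$$d^2z = \left(\frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy\right)^{(2)}$$

In general

$$d^nz = \left(\frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy\right)^{(n)} \tag{10.31}$$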



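Example 10.13. Change of Coordinates. Let z = f(x, y), and consider the change
of coordinates given by x = x(u, v), y = y(u, v), so that z depends implicitly on u and
v. From (10.25)

$$\frac{\Delta z}{\Delta u} = \frac{\partial f(x, y)}{\partial x}\frac{\Delta x}{\Delta u} + \frac{\partial f(x, y)}{\partial y}\frac{\Delta y}{\Delta u} + \epsilon_1 \frac{\Delta x}{\Delta u} + \epsilon_2 \frac{\Delta y}{\Delta u}$$

Allowing $\Delta u$ to approach zero yields

$$\frac{\partial z}{\partial u} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} \tag{10.32}$$

Similarly,

$$\frac{\partial z}{\partial v} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} \tag{10.33}$$

Symbolically,

$$\frac{\partial [\ ]}{\partial r} = \frac{\partial x}{\partial r}\frac{\partial [\ ]}{\partial x} + \frac{\partial y}{\partial r}\frac{\partial [\ ]}{\partial y} \tag{10.34}$$

where the brackets can represent any function of x and y with x and y depending on
any variable r. Of course $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ must, in general, be continuous. In particular

$$\frac{\partial}{\partial u}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial x}{\partial u}\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right) + \frac{\partial y}{\partial u}\frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial^2 z}{\partial x^2}\frac{\partial x}{\partial u} + \frac{\partial^2 z}{\partial y\,\partial x}\frac{\partial y}{\partial u}$$
$$\frac{\partial}{\partial u}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial x}{\partial u}\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) + \frac{\partial y}{\partial u}\frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2 z}{\partial x\,\partial y}\frac{\partial x}{\partial u} + \frac{\partial^2 z}{\partial y^2}\frac{\partial y}{\partial u}$$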


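Example 10.14. Consider z = z(x, y), $x = r\cos\theta$, $y = r\sin\theta$. Then

$$\frac{\partial z}{\partial r} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial r} = \frac{\partial z}{\partial x}\cos\theta + \frac{\partial z}{\partial y}\sin\theta$$

$$\frac{\partial^2 z}{\partial r^2} = \frac{\partial}{\partial r}\left(\frac{\partial z}{\partial r}\right) = \cos\theta\,\frac{\partial}{\partial r}\left(\frac{\partial z}{\partial x}\right) + \sin\theta\,\frac{\partial}{\partial r}\left(\frac{\partial z}{\partial y}\right)$$
$$= \cos\theta\left[\frac{\partial^2 z}{\partial x^2}\frac{\partial x}{\partial r} + \frac{\partial^2 z}{\partial y\,\partial x}\frac{\partial y}{\partial r}\right] + \sin\theta\left[\frac{\partial^2 z}{\partial x\,\partial y}\frac{\partial x}{\partial r} + \frac{\partial^2 z}{\partial y^2}\frac{\partial y}{\partial r}\right]$$
$$= \cos\theta\left[\frac{\partial^2 z}{\partial x^2}\cos\theta + \frac{\partial^2 z}{\partial y\,\partial x}\sin\theta\right] + \sin\theta\left[\frac{\partial^2 z}{\partial x\,\partial y}\cos\theta + \frac{\partial^2 z}{\partial y^2}\sin\theta\right]$$

Let the reader show that

$$\frac{\partial z}{\partial \theta} = -r\sin\theta\,\frac{\partial z}{\partial x} + r\cos\theta\,\frac{\partial z}{\partial y}$$

$$\frac{\partial^2 z}{\partial \theta^2} = -r\cos\theta\,\frac{\partial z}{\partial x} - r\sin\theta\,\frac{\partial z}{\partial y} - r\sin\theta\left[-\frac{\partial^2 z}{\partial x^2}\,r\sin\theta + \frac{\partial^2 z}{\partial y\,\partial x}\,r\cos\theta\right] + r\cos\theta\left[-\frac{\partial^2 z}{\partial x\,\partial y}\,r\sin\theta + \frac{\partial^2 z}{\partial y^2}\,r\cos\theta\right]$$

$$\nabla^2 z = \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 z}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 z}{\partial \theta^2} + \frac{1}{r}\frac{\partial z}{\partial r}$$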
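The polar form of the Laplacian stated at the end of Example 10.14 can be spot-checked by finite differences. In this sketch the test function x^2 y + y^3 (whose Cartesian Laplacian is 8y), the step size, and the name `polar_laplacian` are all assumptions chosen for illustration.

```python
import math

def z(x, y):
    # Test function chosen for illustration; its Laplacian is 8*y.
    return x * x * y + y**3

def Z(r, theta):
    """The same function expressed through x = r cos(theta), y = r sin(theta)."""
    return z(r * math.cos(theta), r * math.sin(theta))

def polar_laplacian(Z, r, theta, h=1e-4):
    """z_rr + z_theta_theta / r^2 + z_r / r, estimated by central differences."""
    z_rr = (Z(r + h, theta) - 2.0 * Z(r, theta) + Z(r - h, theta)) / h**2
    z_tt = (Z(r, theta + h) - 2.0 * Z(r, theta) + Z(r, theta - h)) / h**2
    z_r = (Z(r + h, theta) - Z(r - h, theta)) / (2.0 * h)
    return z_rr + z_tt / r**2 + z_r / r

r, theta = 1.5, 0.7
print(polar_laplacian(Z, r, theta), 8.0 * r * math.sin(theta))
```

The two printed numbers agree to within the accuracy of the difference quotients.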




Problems 


1. Let $z = x^2 - y^2$, $x = e^t$, $y = \sin t$. Find $\dfrac{dz}{dt}$ by two methods.


aie “a 
= 2 2 = . ——y —*y7 —- 7 A oy 
2. Letu = 2x y2,0=x2+y. Find a aa, and ea Hint 

Ou Ox OY 
—— 1 = es | meee = 
OU ne Ou 2y au’ 

Ou OU 

— = — 2y 4, ete. W ; _—- = 

0 = Ap 22 - us oe hy 1s it true that ee U? 


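3. Let $x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, $z = r\cos\theta$. If V = V(x, y, z), show
that

$$\nabla^2 V = \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}\left(r^2\sin\theta\,\frac{\partial V}{\partial r}\right) + \frac{\partial}{\partial \theta}\left(\sin\theta\,\frac{\partial V}{\partial \theta}\right) + \frac{\partial}{\partial \varphi}\left(\frac{1}{\sin\theta}\frac{\partial V}{\partial \varphi}\right)\right]$$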
4. Let $f(x_1, x_2, \ldots, x_n)$ be homogeneous of degree k, that is,

$$f(tx_1, tx_2, \ldots, tx_n) = t^k f(x_1, x_2, \ldots, x_n)$$

Differentiate with respect to t, set t = 1, and prove a result due to Euler,

$$kf = \sum_{\alpha=1}^{n} x_\alpha \frac{\partial f}{\partial x_\alpha}$$


5. Prove (10.31) by mathematical induction. 
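6. If $f(x_1, x_2, \ldots, x_n)$ is homogeneous of degree k (see Prob. 4), show by
mathematical induction that

$$k(k - 1) \cdots (k - p + 1)\,f = \left(x_1 \frac{\partial}{\partial x_1} + x_2 \frac{\partial}{\partial x_2} + \cdots + x_n \frac{\partial}{\partial x_n}\right)^{(p)} f$$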


10.15. Implicit-function Theorems
THEOREM 10.24. Let $x = x_0$, $y = y_0$ satisfy F(x, y) = 0, and let us
suppose that F, $\dfrac{\partial F}{\partial x}$ and $\dfrac{\partial F}{\partial y}$ are continuous in a neighborhood of $x = x_0$,
$y = y_0$, say, for $|x - x_0| < l$, $|y - y_0| < l$. If $\dfrac{\partial F}{\partial y} \neq 0$ at $x = x_0$, $y = y_0$,
there exists one and only one continuous function of the independent
variable x, say, y = f(x), such that F(x, f(x)) = 0 with $y_0 = f(x_0)$. In
other words, if $x_1$ is any number near $x_0$, there exists a unique $y_1$ such that
$F(x_1, y_1) = 0$. It is in this sense that we speak of y as a function of x.

Proof. Assume $\dfrac{\partial F}{\partial y} > 0$ at $P(x_0, y_0)$. The same reasoning applies to
the case $\dfrac{\partial F}{\partial y} < 0$. From continuity of $\dfrac{\partial F(x, y)}{\partial y}$ there exists a neighborhood
of $P(x_0, y_0)$ such that $\dfrac{\partial F}{\partial y} > Q > 0$, Q fixed. We can decrease the




size of this neighborhood, if necessary, so that it is bounded by $|x - x_0| < l$,
$|y - y_0| < l$. In this new neighborhood of $x = x_0$, $y = y_0$, given by
$|x - x_0| < l_0$, $|y - y_0| < l_0$, we have F, $\dfrac{\partial F}{\partial x}$, $\dfrac{\partial F}{\partial y}$ continuous and $\dfrac{\partial F}{\partial y} > Q > 0$.
Let R > 0 be any upper limit of $\left|\dfrac{\partial F}{\partial x}\right|$ in this neighborhood.
For any x and y satisfying $|x - x_0| < l_0$, $|y - y_0| < l_0$ we have

$$F(x, y) = F(x, y) - F(x_0, y_0) = F(x, y) - F(x_0, y) + F(x_0, y) - F(x_0, y_0)$$
$$= (x - x_0)\,\frac{\partial F(x_0 + \theta_1(x - x_0), y)}{\partial x} + (y - y_0)\,\frac{\partial F(x_0, y_0 + \theta_2(y - y_0))}{\partial y}$$

Let $y = y_0 + \epsilon$, $0 < \epsilon < l_0$. Then

$$F(x, y_0 + \epsilon) = (x - x_0)\,\frac{\partial F(x_0 + \theta_1(x - x_0), y_0 + \epsilon)}{\partial x} + \epsilon\,\frac{\partial F(x_0, y_0 + \epsilon\theta_2)}{\partial y}$$
$$> (x - x_0)\,\frac{\partial F(x_0 + \theta_1(x - x_0), y_0 + \epsilon)}{\partial x} + \epsilon Q$$

Since $\epsilon Q > 0$, we shall have $F(x, y_0 + \epsilon) > 0$ if $|x - x_0| < \epsilon Q/R$. Similarly
$F(x, y_0 - \epsilon) < 0$ for $|x - x_0| < \epsilon Q/R$. From Theorem 10.22 a y
exists such that F(x, y) = 0. We have shown that, for any x satisfying
$|x - x_0| < \epsilon Q/R \le l_0$, a number y exists such that F(x, y) = 0. We say
that y is a function of x and write y = f(x), so that F(x, f(x)) = 0.

Next we show that y is a single-valued function of x. Assume that,
for any number $x_1$ satisfying $|x_1 - x_0| < \epsilon Q/R \le l_0$, there exist two numbers
$y_1$ and $y_2$ such that $F(x_1, y_1) = 0$ and $F(x_1, y_2) = 0$. Applying the
law of the mean to $F(x_1, y_2) - F(x_1, y_1) = 0$ yields

$$(y_2 - y_1)\,\frac{\partial F(x_1, y_1 + \theta(y_2 - y_1))}{\partial y} = 0$$

Since $\dfrac{\partial F}{\partial y} > 0$, of necessity $y_2 = y_1$, so that y = f(x) is single-valued. To
show that y is a continuous function of x, we note that if $F(x_1, y_1) = 0$,
$F(x_2, y_2) = 0$, then

$$F(x_2, y_2) - F(x_1, y_2) + F(x_1, y_2) - F(x_1, y_1) = 0$$
$$(x_2 - x_1)\,\frac{\partial F(x_1 + \theta_1(x_2 - x_1), y_2)}{\partial x} + (y_2 - y_1)\,\frac{\partial F(x_1, y_1 + \theta_2(y_2 - y_1))}{\partial y} = 0$$

Since $\dfrac{\partial F}{\partial y} \neq 0$ and $\dfrac{\partial F}{\partial x}$ is bounded, it is obvious that $y_2 \to y_1$ as $x_2 \to x_1$.
This is exactly the property of continuity.

Theorem 10.24 is a special case of Theorem 10.25. 

THEOREM 10.25. Let $F(x_1, x_2, \ldots, x_n, z)$ satisfy the following
conditions:

1. F is continuous at the point $P(a_1, a_2, \ldots, a_n, z_0)$.

2. The first partial derivatives of F exist at P and are continuous.




3. $F(a_1, a_2, \ldots, a_n, z_0) = 0$.
4. $\dfrac{\partial F}{\partial z} \neq 0$ at P.

By the method used in Theorem 10.24 it can be shown that a neighborhood
of $P_0(a_1, a_2, \ldots, a_n)$ exists such that $z = f(x_1, x_2, \ldots, x_n)$ and
$F(x_1, x_2, \ldots, x_n, f(x_1, x_2, \ldots, x_n)) = 0$ with $z_0 = f(a_1, a_2, \ldots, a_n)$.
Moreover z is a continuous function in the variables $x_1, x_2, \ldots, x_n$.

For the case F(x, y, z) = 0, z = f(x, y), F(a, b, c) = 0 we have

$$F(x, y, z) - F(a, b, c) = 0 \qquad \text{if } z = f(x, y)$$

so that

$$F(x, y, z) - F(a, y, z) + F(a, y, z) - F(a, b, z) + F(a, b, z) - F(a, b, c) = 0$$

Applying the law of the mean to each difference yields

$$(x - a)\,\frac{\partial F(a + \theta_1(x - a), y, z)}{\partial x} + (y - b)\,\frac{\partial F(a, b + \theta_2(y - b), z)}{\partial y} + (z - c)\,\frac{\partial F(a, b, c + \theta_3(z - c))}{\partial z} = 0$$

If y = b, we have

$$\lim_{x \to a} \frac{z - c}{x - a} = \frac{\partial z}{\partial x} = -\frac{\partial F/\partial x}{\partial F/\partial z}$$

Similarly

$$\frac{\partial z}{\partial y} = -\frac{\partial F/\partial y}{\partial F/\partial z} \tag{10.35}$$

Example 10.15. Consider $F(x, y, z) = xe^z + yz - y$. At the point (1, 1, 0) we
have F(1, 1, 0) = 0, and $\dfrac{\partial F}{\partial z} = xe^z + y \neq 0$ at (1, 1, 0). Thus F(x, y, z) = 0 defines
z as a function of x and y, z = f(x, y), for a neighborhood of (1, 1) such that 0 = f(1, 1).
To compute $\dfrac{\partial z}{\partial x}$ we use (10.35). Thus $\dfrac{\partial F}{\partial x} = e^z$, $\dfrac{\partial F}{\partial z} = xe^z + y$, so that $\dfrac{\partial z}{\partial x} = -\dfrac{e^z}{xe^z + y}$.

Notice that we do not claim to solve for z explicitly in terms of x and y.
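Although z is not available in closed form, the implicit function of Example 10.15 can be evaluated numerically and compared with (10.35). The sketch below assumes that the example's function is F(x, y, z) = x e^z + yz - y; the Newton iteration and the names `solve_z` and `dz_dx_formula` are illustrative choices, not part of the text.

```python
import math

def F(x, y, z):
    # Assumed form of the example: F = x e^z + y z - y.
    return x * math.exp(z) + y * z - y

def solve_z(x, y, z0=0.0, tol=1e-14, max_iter=50):
    """Newton iteration in z for F(x, y, z) = 0, started near z0."""
    z = z0
    for _ in range(max_iter):
        dFdz = x * math.exp(z) + y       # dF/dz, nonzero near (1, 1, 0)
        step = F(x, y, z) / dFdz
        z -= step
        if abs(step) < tol:
            break
    return z

def dz_dx_formula(x, y, z):
    """Right-hand side of (10.35): -(dF/dx) / (dF/dz)."""
    return -math.exp(z) / (x * math.exp(z) + y)

h = 1e-6
numeric = (solve_z(1.0 + h, 1.0) - solve_z(1.0 - h, 1.0)) / (2.0 * h)
print(numeric, dz_dx_formula(1.0, 1.0, 0.0))
```

Both numbers are close to -1/2, the value of (10.35) at the point (1, 1, 0).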


THEOREM 10.26. Let $F_1(x, u, v) = 0$, $F_2(x, u, v) = 0$ be two equations
involving the variables (x, u, v). Suppose that the following conditions
are satisfied:

1. $F_1(x_0, u_0, v_0) = 0$, $F_2(x_0, u_0, v_0) = 0$.

2. $F_1$ and $F_2$ have continuous first partial derivatives for a neighborhood
of $(x_0; u_0, v_0)$. Of necessity $F_1$ and $F_2$ are continuous.

3. The Jacobian of $F_1$ and $F_2$ relative to u and v, written

$$J\left(\frac{F_1, F_2}{u, v}\right) = \begin{vmatrix} \dfrac{\partial F_1}{\partial u} & \dfrac{\partial F_1}{\partial v} \\[2mm] \dfrac{\partial F_2}{\partial u} & \dfrac{\partial F_2}{\partial v} \end{vmatrix}$$

does not vanish at $(x_0; u_0, v_0)$.




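Under these conditions there exists one and only one system of continuous
functions

$$u = \varphi_1(x) \qquad v = \varphi_2(x)$$

such that $F_1(x, \varphi_1(x), \varphi_2(x)) = 0$, $F_2(x, \varphi_1(x), \varphi_2(x)) = 0$ with $u_0 = \varphi_1(x_0)$,
$v_0 = \varphi_2(x_0)$.

Proof. If $J \neq 0$, then $\dfrac{\partial F_2}{\partial v} \neq 0$ or $\dfrac{\partial F_2}{\partial u} \neq 0$. Without loss of generality
we take $\dfrac{\partial F_2}{\partial v} \neq 0$. Also we note that J is continuous, so that $J \neq 0$ for
some neighborhood of $(x_0; u_0, v_0)$, and, moreover, $\dfrac{\partial F_2}{\partial v} \neq 0$ for a neighborhood
of $(x_0; u_0, v_0)$. From Theorem 10.25 we can solve for v as a unique
function of x and u, v = f(x, u), such that $F_2(x, u, f(x, u)) = 0$. Hence

$$\frac{\partial F_2}{\partial u} + \frac{\partial F_2}{\partial v}\frac{\partial f}{\partial u} = 0 \qquad \frac{\partial f}{\partial u} = -\frac{\partial F_2/\partial u}{\partial F_2/\partial v}$$

by considering $F_2$ as a function of x and u. From $F_1(x, u, v)$, v = f(x, u)
we note that $F_1$ is a function of x and u, so that

$$\left.\frac{\partial F_1}{\partial u}\right|_{x = \text{constant}} = \left.\frac{\partial F_1}{\partial u}\right|_{\substack{x = \text{constant} \\ v = \text{constant}}} + \left.\frac{\partial F_1}{\partial v}\right|_{\substack{x = \text{constant} \\ u = \text{constant}}} \left.\frac{\partial f}{\partial u}\right|_{x = \text{constant}}$$
$$= \frac{\partial F_1}{\partial u} - \frac{\partial F_1}{\partial v}\,\frac{\partial F_2/\partial u}{\partial F_2/\partial v} = \frac{J\left(\dfrac{F_1, F_2}{u, v}\right)}{\partial F_2/\partial v}$$

Since $J \neq 0$, we have $\dfrac{\partial F_1(x, u, f(x, u))}{\partial u} \neq 0$ so that from Theorem 10.24
we can solve $F_1(x, u, f(x, u)) = 0$ for $u = \varphi_1(x)$, $\varphi_1(x)$ unique. Hence
$v = f(x, u) = f(x, \varphi_1(x)) = \varphi_2(x)$. Q.E.D. Theorem 10.26 holds if we
consider $F_1(x_1, x_2, \ldots, x_n, u, v) = 0$, $F_2(x_1, x_2, \ldots, x_n, u, v) = 0$.
The proof proceeds in the same manner, making use of Theorem 10.25.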


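Example 10.16. If x = f(u, v), $y = \varphi(u, v)$, then $F_1(x, y, u, v) = f(u, v) - x = 0$,
$F_2(x, y, u, v) = \varphi(u, v) - y = 0$. We can solve for u and v as functions of x and y
provided

$$J\left(\frac{F_1, F_2}{u, v}\right) = \begin{vmatrix} \dfrac{\partial F_1}{\partial u} & \dfrac{\partial F_1}{\partial v} \\[2mm] \dfrac{\partial F_2}{\partial u} & \dfrac{\partial F_2}{\partial v} \end{vmatrix} = \begin{vmatrix} \dfrac{\partial f}{\partial u} & \dfrac{\partial f}{\partial v} \\[2mm] \dfrac{\partial \varphi}{\partial u} & \dfrac{\partial \varphi}{\partial v} \end{vmatrix} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} \neq 0 \tag{10.36}$$

along with the condition that the first partials of x and y be continuous. If $x = r\cos\theta$,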




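$y = r\sin\theta$, then

$$J\left(\frac{x, y}{r, \theta}\right) = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r$$

For $r \neq 0$ we can solve for r and $\theta$ uniquely in terms of x and y in the neighborhood of
any point $x_0 = r_0\cos\theta_0$, $y_0 = r_0\sin\theta_0$. Indeed,

$$r = (x^2 + y^2)^{1/2} \qquad \theta = \tan^{-1}\frac{y}{x}$$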
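The value J = r for the polar transformation, and the local inverse, can be illustrated numerically. In the sketch below the function names and step size are assumptions; `math.atan2` is used for the inverse because it selects the correct quadrant, which the bare formula tan^{-1}(y/x) does not do by itself.

```python
import math

def to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def jacobian(r, theta, h=1e-6):
    """Central-difference estimate of J(x, y / r, theta)."""
    x_r = (to_cartesian(r + h, theta)[0] - to_cartesian(r - h, theta)[0]) / (2 * h)
    y_r = (to_cartesian(r + h, theta)[1] - to_cartesian(r - h, theta)[1]) / (2 * h)
    x_t = (to_cartesian(r, theta + h)[0] - to_cartesian(r, theta - h)[0]) / (2 * h)
    y_t = (to_cartesian(r, theta + h)[1] - to_cartesian(r, theta - h)[1]) / (2 * h)
    return x_r * y_t - x_t * y_r

def to_polar(x, y):
    # Inverse transformation, valid away from the origin (r != 0).
    return math.hypot(x, y), math.atan2(y, x)

print(jacobian(2.0, 0.7))                    # close to r = 2
print(to_polar(*to_cartesian(2.0, 0.7)))     # recovers (2.0, 0.7)
```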


By mathematical induction one can extend the results of Theorem 
10.26. We state Theorem 10.27 but give no proof. 


THEOREM 10.27. Let $f_i = f_i(x_1, x_2, \ldots, x_m, u_1, u_2, \ldots, u_n)$,
i = 1, 2, . . . , n, satisfy the following requirements:
1. Each $f_i$, i = 1, 2, . . . , n, has continuous first partial derivatives
at the point $P(a_1, a_2, \ldots, a_m, b_1, b_2, \ldots, b_n)$.
2. $f_i = 0$ at P, i = 1, 2, . . . , n.
3. $J\left(\dfrac{f_1, f_2, \ldots, f_n}{u_1, u_2, \ldots, u_n}\right) = \left|\dfrac{\partial f_i}{\partial u_j}\right| \neq 0$ at P.

There exists one and only one system of continuous functions,

$$u_i = \varphi_i(x_1, x_2, \ldots, x_m) \qquad i = 1, 2, \ldots, n$$

which satisfy (2) such that $\varphi_i(a_1, a_2, \ldots, a_m) = b_i$, i = 1, 2, . . . , n.
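Example 10.17. For the coordinate transformation

$$y_i = f_i(x_1, x_2, \ldots, x_n) \qquad i = 1, 2, \ldots, n$$

we note that $F_i = f_i(x_1, x_2, \ldots, x_n) - y_i = 0$, so that we can solve for $x_i$ as a function
of $y_1, y_2, \ldots, y_n$ provided

$$\left|\frac{\partial F_i}{\partial x_j}\right| = \left|\frac{\partial f_i}{\partial x_j}\right| = \left|\frac{\partial y_i}{\partial x_j}\right| \neq 0$$

assuming that the first partials are continuous.
For the system of linear equations $y^i = a^i_\alpha x^\alpha$, i = 1, 2, . . . , n, we have

$$\left|\frac{\partial y^i}{\partial x^j}\right| = |a^i_j|$$

so that we can solve for $x^i$, i = 1, 2, . . . , n, as a function of $y^1, y^2, \ldots, y^n$, provided
$|a^i_j| \neq 0$ (see Sec. 1.2).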


THEOREM 10.28. Let $u_i = f_i(x_1, x_2, \ldots, x_n)$, i = 1, 2, . . . , n,
have continuous first partial derivatives at $P(x_1^0, x_2^0, \ldots, x_n^0)$. A necessary
and sufficient condition that a functional relationship of the form
$F(u_1, u_2, \ldots, u_n) = 0$ exist is that

$$J\left(\frac{u_1, u_2, \ldots, u_n}{x_1, x_2, \ldots, x_n}\right) = 0 \tag{10.37}$$


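Proof. First assume that $F(u_1, u_2, \ldots, u_n) = 0$. Then

$$\frac{\partial F}{\partial x_j} = \sum_{\alpha=1}^{n} \frac{\partial F}{\partial u_\alpha}\frac{\partial u_\alpha}{\partial x_j} = 0 \qquad j = 1, 2, \ldots, n \tag{10.38}$$

Since $\sum_{\alpha=1}^{n} \dfrac{\partial F}{\partial u_\alpha}\dfrac{\partial u_\alpha}{\partial x_j} = 0$ can be looked upon as a linear system of n equations
in the n quantities $\dfrac{\partial F}{\partial u_\alpha}$, $\alpha = 1, 2, \ldots, n$, with $\dfrac{\partial F}{\partial u_\alpha} \neq 0$ for at least
one $\alpha$, of necessity

$$\left|\frac{\partial u_\alpha}{\partial x_j}\right| = 0$$

(see Sec. 1.4). Hence (10.39) has been shown to be a necessary condition
for the existence of a functional relationship involving the $u_i$, i = 1, 2,
. . . , n. Why is $\dfrac{\partial F}{\partial u_\alpha} \neq 0$ for at least one $\alpha$?

Conversely, let us assume that (10.39) holds, so that

$$\begin{vmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\[2mm] \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial f_n}{\partial x_1} & \dfrac{\partial f_n}{\partial x_2} & \cdots & \dfrac{\partial f_n}{\partial x_n} \end{vmatrix} = 0 \tag{10.39}$$

For the sake of simplicity we assume that the minor of $\dfrac{\partial f_n}{\partial x_n}$ does not
vanish, so that

$$\begin{vmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_{n-1}} \\[2mm] \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_{n-1}} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial f_{n-1}}{\partial x_1} & \dfrac{\partial f_{n-1}}{\partial x_2} & \cdots & \dfrac{\partial f_{n-1}}{\partial x_{n-1}} \end{vmatrix} \neq 0 \tag{10.40}$$

Now let

$$\begin{aligned} y_1 &= u_1 = f_1(x_1, x_2, \ldots, x_n) \\ y_2 &= u_2 = f_2(x_1, x_2, \ldots, x_n) \\ &\ \ \vdots \\ y_{n-1} &= u_{n-1} = f_{n-1}(x_1, x_2, \ldots, x_n) \\ y_n &= x_n \end{aligned} \tag{10.41}$$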




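From (10.40) and (10.41) we have

$$J\left(\frac{y_1, y_2, \ldots, y_n}{x_1, x_2, \ldots, x_n}\right) = \begin{vmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial f_{n-1}}{\partial x_1} & \dfrac{\partial f_{n-1}}{\partial x_2} & \cdots & \dfrac{\partial f_{n-1}}{\partial x_n} \\[2mm] 0 & 0 & \cdots & 1 \end{vmatrix} \neq 0$$

From Theorem 10.27 we can solve for the $x_i$, i = 1, 2, . . . , n, as functions
of the $y_i$, i = 1, 2, . . . , n, so that

$$x_i = \varphi_i(y_1, y_2, \ldots, y_n) \qquad i = 1, 2, \ldots, n \tag{10.42}$$

and

$$u_n = f_n(x_1, x_2, \ldots, x_n) = F(y_1, y_2, \ldots, y_n)$$

along with $u_1 = y_1$, $u_2 = y_2$, . . . , $u_{n-1} = y_{n-1}$ [see (10.41)]. From

$$J\left(\frac{u_1, u_2, \ldots, u_n}{y_1, y_2, \ldots, y_n}\right) = J\left(\frac{u_1, u_2, \ldots, u_n}{x_1, x_2, \ldots, x_n}\right) \cdot J\left(\frac{x_1, x_2, \ldots, x_n}{y_1, y_2, \ldots, y_n}\right) = 0$$

(see Sec. 1.2) we have

$$\begin{vmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \\ \dfrac{\partial F}{\partial y_1} & \dfrac{\partial F}{\partial y_2} & \cdots & \dfrac{\partial F}{\partial y_{n-1}} & \dfrac{\partial F}{\partial y_n} \end{vmatrix} = 0$$

so that $\dfrac{\partial F}{\partial y_n} = 0$ and $u_n = F(y_1, y_2, \ldots, y_{n-1}) = F(u_1, u_2, \ldots, u_{n-1})$.
Thus $G(u_1, u_2, \ldots, u_n) = F(u_1, u_2, \ldots, u_{n-1}) - u_n = 0$. G is the
functional relationship predicted by the theorem. Q.E.D.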


Example 10.18. Consider u = x + y + z, v = x - y - z, w = 2x. Then

$$J\left(\frac{u, v, w}{x, y, z}\right) = \begin{vmatrix} 1 & 1 & 1 \\ 1 & -1 & -1 \\ 2 & 0 & 0 \end{vmatrix} = 0$$

It is obvious that G(u, v, w) = u + v - w = 0.
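Example 10.18 is small enough to verify directly. The sketch below takes the three functions to be u = x + y + z, v = x - y - z, w = 2x (an assumption consistent with the relation u + v - w = 0); the helper names `det3` and `G` are introduced here for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Rows are the gradients of u = x + y + z, v = x - y - z, w = 2x.
J = [[1, 1, 1],
     [1, -1, -1],
     [2, 0, 0]]

def G(x, y, z):
    """Functional relation u + v - w evaluated at a point (x, y, z)."""
    u, v, w = x + y + z, x - y - z, 2 * x
    return u + v - w

print(det3(J), G(3, -2, 5))
```

The vanishing determinant and the identically vanishing G illustrate the two sides of Theorem 10.28.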
Problems 


1. Let $x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, $z = r\cos\theta$. Show that

$$J\left(\frac{x, y, z}{r, \theta, \varphi}\right) = r^2\sin\theta$$

2. In Prob. 1 solve for r, $\theta$, $\varphi$ in terms of x, y, z.




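3. Let u = xyz, v = xy + yz + zx, w = x + y + z. Show that

$$J\left(\frac{u, v, w}{x, y, z}\right) = (x - y)(y - z)(z - x)$$

4. For $J \neq 0$ in Prob. 3 solve for $\dfrac{\partial x}{\partial u}$. Hint: Assuming that we can solve for x, y, z
in terms of u, v, w since $J \neq 0$, we have, upon implicit differentiation with respect to u,

$$\frac{\partial u}{\partial u} = 1 = yz\,\frac{\partial x}{\partial u} + zx\,\frac{\partial y}{\partial u} + xy\,\frac{\partial z}{\partial u}$$
$$\frac{\partial v}{\partial u} = 0 = (y + z)\,\frac{\partial x}{\partial u} + (z + x)\,\frac{\partial y}{\partial u} + (x + y)\,\frac{\partial z}{\partial u}$$
$$\frac{\partial w}{\partial u} = 0 = \frac{\partial x}{\partial u} + \frac{\partial y}{\partial u} + \frac{\partial z}{\partial u}$$

Solve this linear system for $\dfrac{\partial x}{\partial u}$.

5. Let u = x + y + z, v = xy + yz + zx, $w = x^2 + y^2 + z^2$. Show that

$$J\left(\frac{u, v, w}{x, y, z}\right) = 0$$

Find a functional relationship between u, v, w.
6. Consider a function of $(x^1, x^2, \ldots, x^n)$, say, $f(x^1, x^2, \ldots, x^n)$. We define
the Hessian of f by

$$H = \left|\frac{\partial^2 f}{\partial x^i\,\partial x^j}\right|$$

Under the coordinate transformation $x^i = a^i_\alpha y^\alpha$ the function f becomes $F(y^1, y^2, \ldots, y^n)$.
Show that

$$\left|\frac{\partial^2 F}{\partial y^i\,\partial y^j}\right| = A^2 \left|\frac{\partial^2 f}{\partial x^i\,\partial x^j}\right| \qquad A = |a^i_j|$$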


10.16. The Riemann Integral. Let us consider any bounded function
f(x) defined over the range $a \le x \le b$. f(x) need not be continuous,
but we desire |f(x)| < A. We subdivide the range (a, b) into n arbitrary
subdivisions,

$$a = x_0 < x_1 < x_2 < \cdots < x_{i-1} < x_i < \cdots < x_n = b$$

and form the sum

$$S_n = \sum_{i=1}^{n} M_i(x_i - x_{i-1}) \tag{10.43}$$

where $M_i$ is the supremum of f(x) for the interval $x_{i-1} \le x \le x_i$. The
sum $S_n$ obviously will depend on the manner in which we subdivide the
interval (a, b). The infimum of all such sums obtained in this manner
is called the upper Darboux integral of f(x), written

$$\bar{S} = \overline{\int_a^b} f(x)\,dx \tag{10.44}$$




Since $S_n \ge -A(b - a)$, the infimum $\bar{S}$ exists for all bounded functions.
Similarly, we can form the sum

$$s_n = \sum_{i=1}^{n} m_i(x_i - x_{i-1}) \tag{10.45}$$

with $m_i$ the infimum of f(x) for the interval $x_{i-1} \le x \le x_i$. The
supremum of all such sums obtained in this manner is called the lower
Darboux integral of f(x), written

$$\underline{S} = \underline{\int_a^b} f(x)\,dx \tag{10.46}$$

The reader can easily verify that $\underline{S} \le \bar{S}$.
If $\underline{S} = \bar{S}$, we say that f(x) is Riemann-integrable (R-integrable) and
we write

$$\underline{S} = \bar{S} = \int_a^b f(x)\,dx \tag{10.47}$$

We now investigate some conditions for which f(x) is R-integrable.
1. Let f(x) = constant = c. Then

$$S_n = \sum_{i=1}^{n} c(x_i - x_{i-1}) = c\sum_{i=1}^{n} (x_i - x_{i-1}) = c(b - a)$$
$$s_n = c(b - a)$$

so that $\bar{S} = \underline{S} = c(b - a) = \int_a^b c\,dx$.
2. Let f(x) = x for $0 \le x \le 1$. We consider the subdivision
$0 < 1/n < 2/n < \cdots < i/n < \cdots < 1$. Then

$$S_n = \sum_{i=1}^{n} \frac{i}{n}\left(\frac{i}{n} - \frac{i - 1}{n}\right) = \frac{1}{n^2}\sum_{i=1}^{n} i = \frac{n(n + 1)}{2n^2} = \frac{1}{2}\left(1 + \frac{1}{n}\right)$$

The infimum of all sums obtained by equally spaced divisions is seen to
be $\tfrac{1}{2}$. Similarly

$$s_n = \sum_{i=1}^{n} \frac{i - 1}{n}\left(\frac{i}{n} - \frac{i - 1}{n}\right) = \frac{1}{n^2}\sum_{i=1}^{n} (i - 1) = \frac{1}{2}\left(1 - \frac{1}{n}\right)$$

The supremum of all such sums is $\tfrac{1}{2}$. Let the reader prove that $\bar{S} = \underline{S} = \tfrac{1}{2}$
for all manners of subdivisions (not necessarily equally spaced), so that

$$\int_0^1 x\,dx = \frac{1}{2}$$

3. A continuous function is always R-integrable for the finite range (a, b).


Proof. Since f(x) is continuous, it is uniformly continuous for
$a \le x \le b$. Hence for any $\epsilon > 0$ there exists a finite subdivision of
$a \le x \le b$ such that the difference between the supremum and infimum of f(x) on any
interval of the subdivision is less than $\epsilon/(b - a)$. This yields

$$|S_n - s_n| < \frac{\epsilon}{b - a}\,(b - a) = \epsilon$$

If necessary we subdivide further so that $|\bar{S} - S_n| < \epsilon$, $|\underline{S} - s_n| < \epsilon$.
Why can this be done? One must recall the definitions and properties
of the supremum and infimum. This is left for the reader. Hence
$|\bar{S} - \underline{S}| < 3\epsilon$. Since $\epsilon$ can be chosen arbitrarily small, of necessity
$\bar{S} - \underline{S} = 0$, $\bar{S} = \underline{S}$. Q.E.D. It must be remembered that $\bar{S}$ and $\underline{S}$ are
fixed numbers.

4. Any bounded monotonic increasing or decreasing function is R-integrable
(see Prob. 10).

5. An example of a function which is not R-integrable is as follows:
f(x) = 1 for x irrational, f(x) = 0 for x rational, $0 \le x \le 1$. Let the
reader show that $\bar{S} = 1$, $\underline{S} = 0$.
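The equally spaced upper and lower sums of item 2 above, for f(x) = x on the interval from 0 to 1, can be reproduced exactly with rational arithmetic. The function name `darboux_sums_identity` is an assumption made for this sketch; since f is increasing, its supremum and infimum on each subinterval are the values at the right and left end points.

```python
from fractions import Fraction

def darboux_sums_identity(n):
    """Upper and lower sums of f(x) = x on [0, 1] with n equal subintervals."""
    width = Fraction(1, n)
    upper = sum(Fraction(i, n) * width for i in range(1, n + 1))       # M_i = i/n
    lower = sum(Fraction(i - 1, n) * width for i in range(1, n + 1))   # m_i = (i-1)/n
    return upper, lower

for n in (1, 10, 100):
    upper, lower = darboux_sums_identity(n)
    print(n, upper, lower, upper - lower)
```

The upper sums decrease and the lower sums increase toward 1/2, and their difference is exactly 1/n.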

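We list some properties of the Riemann integral. It is assumed that
the integrals under discussion exist.

(1) $\int_a^b cf(x)\,dx = c\int_a^b f(x)\,dx$

(2) $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$

(3) $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$

(4) If $f(x) \ge 0$, then $\int_a^b f(x)\,dx \ge 0$ for $b \ge a$.

(5) $\int_a^b [f(x) + \varphi(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b \varphi(x)\,dx$

(6) $\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx$, $b \ge a$

(7) $\left[\int_a^b \varphi(x)f(x)\,dx\right]^2 \le \int_a^b \varphi^2(x)\,dx \int_a^b f^2(x)\,dx$

Proof.

$$\int_a^b [\lambda\varphi(x) + f(x)]^2\,dx = \lambda^2\int_a^b \varphi^2(x)\,dx + 2\lambda\int_a^b \varphi(x)f(x)\,dx + \int_a^b f^2(x)\,dx \ge 0$$

for b > a. If $y = A\lambda^2 + 2B\lambda + C \ge 0$ for all real values of $\lambda$, then
$A\lambda^2 + 2B\lambda + C = 0$ has either two equal real roots or no real roots.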




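Of necessity the discriminant satisfies $(2B)^2 - 4AC \le 0$, that is, $B^2 - AC \le 0$, which yields

$$\left[\int_a^b \varphi(x)f(x)\,dx\right]^2 \le \int_a^b \varphi^2(x)\,dx \int_a^b f^2(x)\,dx \tag{10.48}$$

(8) $\left|\int_a^b f(x)\,dx\right| \le M|b - a|$, M = supremum of |f(x)| for $a \le x \le b$

(9) The First Theorem of the Mean for Integrals. If f(x) is continuous
on the range $a \le x \le b$, then

$$\int_a^b f(x)\,dx = (b - a)f(\xi) \qquad a \le \xi \le b$$

The proof of (9) is left as an exercise for the reader (see Prob. 8).

(10) $\int_a^a f(x)\,dx = 0$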
We should like to emphasize that, if f(x) is R-integrable on the range
(a, b), we can write

$$I = \int_a^b f(x)\,dx = \int_a^b f(t)\,dt$$

since the value of the integral does not depend on the variable which is
used to describe f(x). Of course I depends on the upper and lower limits,
b and a, respectively. Moreover I does depend on the form of f(x).
Once f(x) is chosen, however, we need not use x as the variable of integration.
If f(x) is R-integrable for the range (a, b), then $\int_a^x f(t)\,dt$ exists
for $a \le x \le b$. Each value of x determines a value of $\int_a^x f(t)\,dt$, so that
a single-valued function, F(x), can be defined by

$$F(x) = \int_a^x f(t)\,dt \tag{10.49}$$

We could write $F(x) = \int_a^x f(x)\,dx$, but confusion is avoided if we use
(10.49). From (10.49) we have

$$F(x + \Delta x) = \int_a^{x + \Delta x} f(t)\,dt$$

and

$$F(x + \Delta x) - F(x) = \int_a^{x + \Delta x} f(t)\,dt - \int_a^x f(t)\,dt = \int_x^a f(t)\,dt + \int_a^{x + \Delta x} f(t)\,dt = \int_x^{x + \Delta x} f(t)\,dt \tag{10.50}$$

Since f(x) is bounded for $a \le x \le b$, we have

$$|F(x + \Delta x) - F(x)| \le A\,|\Delta x| \tag{10.51}$$

Hence F(x) is continuous since $F(x + \Delta x) \to F(x)$ as $\Delta x \to 0$.


From (10.50) we note that

$$F(x + \Delta x) - F(x) = f(\xi)\,\Delta x \qquad x \le \xi \le x + \Delta x$$

[see (9) above], provided f(x) is continuous on $(x, x + \Delta x)$. Hence

$$\frac{dF(x)}{dx} = F'(x) = \lim_{\Delta x \to 0} \frac{F(x + \Delta x) - F(x)}{\Delta x} = \lim_{\xi \to x} f(\xi) = f(x)$$
$$\frac{d}{dx}\int_a^x f(t)\,dt = f(x) \tag{10.52}$$

We can obtain (10.52) without recourse to the law of the mean. We
have

$$\frac{F(x + \Delta x) - F(x)}{\Delta x} - f(x) = \frac{1}{\Delta x}\int_x^{x + \Delta x} f(t)\,dt - \frac{1}{\Delta x}\int_x^{x + \Delta x} f(x)\,dt = \frac{1}{\Delta x}\int_x^{x + \Delta x} [f(t) - f(x)]\,dt$$

If f(t) is continuous at t = x, we have $|f(t) - f(x)| < \epsilon$ for $|t - x| < \delta$.
Choose $|\Delta x| < \delta$ so that

$$\left|\frac{F(x + \Delta x) - F(x)}{\Delta x} - f(x)\right| < \frac{1}{|\Delta x|}\left|\int_x^{x + \Delta x} \epsilon\,dt\right| = \epsilon$$

Since $\epsilon$ can be chosen arbitrarily small, we note that

$$F'(x) = \lim_{\Delta x \to 0} \frac{F(x + \Delta x) - F(x)}{\Delta x} = f(x)$$

We obtain the fundamental theorem of the integral calculus as follows:
Let G(x) be any function whose derivative is f(x), f(x) continuous. Then
$G'(x) = F'(x)$ for $a \le x \le b$. From Prob. 8, Sec. 10.13, we note that
F(x) = G(x) + C, C = constant. Hence

$$G(x) + C = \int_a^x f(t)\,dt$$

For x = a we have G(a) + C = 0 so that

$$G(x) - G(a) = \int_a^x f(t)\,dt$$

For x = b we obtain

$$G(b) - G(a) = \int_a^b f(x)\,dx \tag{10.53}$$

Hence, to evaluate $\int_a^b f(x)\,dx$, we find any function G(x) whose derivative
is f(x); the difference between G(b) and G(a) yields $\int_a^b f(x)\,dx$.
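The statement (10.53) can be illustrated by comparing a Riemann sum with the difference G(b) - G(a). The midpoint rule, the choice f(x) = cos x with G(x) = sin x, and the name `riemann_sum` are assumptions made for this sketch.

```python
import math

def riemann_sum(f, a, b, n):
    """Riemann sum of f on [a, b] using n equal subintervals and midpoints."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.0, 1.0
# G(x) = sin x has derivative cos x, so (10.53) predicts sin(b) - sin(a).
for n in (10, 100, 1000):
    print(n, riemann_sum(math.cos, a, b, n), math.sin(b) - math.sin(a))
```

As n grows the sums settle on sin 1, the value given by the antiderivative.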




Problems 


1. From $(k + 1)^2 - k^2 = 2k + 1$ we have $\sum_{k=1}^{n} [(k + 1)^2 - k^2] = n + 2\sum_{k=1}^{n} k$, so
that $\sum_{k=1}^{n} k = \tfrac{1}{2}[(n + 1)^2 - 1 - n] = n(n + 1)/2$. Consider

$$(k + 1)^3 - k^3 = 3k^2 + 3k + 1$$

and show that $\sum_{k=1}^{n} k^2 = \tfrac{1}{6}\,n(n + 1)(2n + 1)$.
2. Without recourse to the fundamental theorem of the integral calculus show that
$\int_0^1 x^2\,dx = \tfrac{1}{3}$ by making use of the result of Prob. 1.
3. If f(x) is continuous for $a \le x \le b$, b > a, and if $\int_a^x f(t)\,dt = 0$ for all x on
(a, b), show that $f(x) \equiv 0$.
4. If f(x) is continuous on (a, b), b > a, and if, for any continuous function $\varphi(x)$,
$\int_a^b f(x)\varphi(x)\,dx = 0$, show that $f(x) \equiv 0$ on (a, b).
5. If f(x) is continuous for $a \le x \le b$, b > a, and if $f(x) \ge 0$ on (a, b),

$$\int_a^b f(x)\,dx = 0$$

show that $f(x) \equiv 0$ on (a, b).
6. Let f(x) = 1 for $0 \le x \le \tfrac{1}{2}$, f(x) = 2 for $\tfrac{1}{2} < x \le 1$, and consider

$$F(x) = \int_0^x f(t)\,dt$$

$0 \le x \le 1$. Obtain a graph of F(x). Why does $F'(x)$ not exist at $x = \tfrac{1}{2}$? Does this
contradict (10.52)?

7. Consider the interval $0 \le x \le 1$. Define f(x) as follows: f(x) = 0 if x is irrational,
f(x) = 1/q if x = p/q, p and q integers in lowest form. Show that f(x) is
R-integrable on $0 \le x \le 1$.

8. Prove the first theorem of the mean for integrals. If f(x) is continuous on (a, b),
$\varphi(x)$ R-integrable on (a, b), and $\varphi(x)$ of the same sign on (a, b), then

$$\int_a^b f(x)\varphi(x)\,dx = f(\xi)\int_a^b \varphi(x)\,dx \qquad a \le \xi \le b$$

9. If $f'(x)$ and $\varphi'(x)$ are R-integrable, show that

$$\int_a^b f(x)\varphi'(x)\,dx = f(x)\varphi(x)\Big|_a^b - \int_a^b \varphi(x)f'(x)\,dx = f(b)\varphi(b) - f(a)\varphi(a) - \int_a^b \varphi(x)f'(x)\,dx \tag{10.54}$$

Equation (10.54) represents an integration by parts.
10. Prove that any bounded monotonic function is R-integrable.




10.17. Integrals with a Parameter. Let us consider the integral

$$F(t) = \int_a^b f(x, t)\,dx \qquad t_0 \le t \le t_1 \tag{10.55}$$

After f(x, t) is integrated with respect to x, what remains depends on the
parameter t, $t_0 \le t \le t_1$. Of course F(t) depends also on a and b, but for
the present we assume that these numbers are fixed and finite. For
example,
1 
dx 1 ] 
— ——————- = 1 
F(t) | ets ; tan t> 0 
is a well-defined function of ¢ for t > 0. 
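We now determine some conditions for which F(t) will be continuous.
We desire

$$\lim_{t \to \bar{t}} F(t) = F(\bar{t}) = \int_a^b f(x, \bar{t})\,dx \tag{10.56}$$

If $\dfrac{\partial f(x, t)}{\partial t}$ exists and is a bounded R-integrable function with respect to
x for $t_0 \le t \le t_1$, then

$$|F(t) - F(\bar{t})| = \left|\int_a^b [f(x, t) - f(x, \bar{t})]\,dx\right| = \left|\int_a^b \frac{\partial f(x, \tilde{t})}{\partial t}\,(t - \bar{t})\,dx\right| < M|t - \bar{t}| \tag{10.57}$$

by applying the law of the mean to $f(x, t) - f(x, \bar{t})$, with $\tilde{t}$ between t and $\bar{t}$. From (10.57) we see
that $\lim_{t \to \bar{t}} F(t) = F(\bar{t})$. A less stringent condition which does not imply
the existence of $\dfrac{\partial f}{\partial t}$ and which enables us to prove the continuity of
F(t) is as follows: Let f(x, t) be uniformly continuous in t with respect to x
so that for any $\epsilon > 0$ there exists a $\delta > 0$, $\delta$ independent of x, such that
$|f(x, t) - f(x, \bar{t})| < \epsilon$ whenever $|t - \bar{t}| < \delta$. One notes immediately that

$$|F(t) - F(\bar{t})| \le \left|\int_a^b \epsilon\,dx\right| = \epsilon|b - a|$$

Since $\epsilon$ can be chosen arbitrarily small, we see that $F(t) \to F(\bar{t})$ as $t \to \bar{t}$.
Of more particular interest is the possibility of showing that

$$\frac{d}{dt}\int_a^b f(x, t)\,dx = \int_a^b \frac{\partial f(x, t)}{\partial t}\,dx \tag{10.58}$$

Let us assume that $\dfrac{\partial f(x, t)}{\partial t}$ is continuous for $a \le x \le b$, $t_0 \le t \le t_1$. We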




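define G(t) by

$$G(t) = \int_a^b \frac{\partial f(x, t)}{\partial t}\,dx$$

Then

$$\int_{t_0}^{t} G(t)\,dt = \int_{t_0}^{t}\int_a^b \frac{\partial f(x, t)}{\partial t}\,dx\,dt = \int_a^b\int_{t_0}^{t} \frac{\partial f(x, t)}{\partial t}\,dt\,dx = \int_a^b [f(x, t) - f(x, t_0)]\,dx = F(t) - F(t_0) \tag{10.59}$$

The interchange of the order of integration can be justified since the
integrand is continuous. From (10.52) we obtain

$$\frac{dF(t)}{dt} = \frac{d}{dt}\int_a^b f(x, t)\,dx = G(t) = \int_a^b \frac{\partial f(x, t)}{\partial t}\,dx$$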


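Example 10.19. We consider

$$F(t) = \int_0^\pi e^{t\cos x}\,dx \qquad |t| < T \tag{10.60}$$

It is obvious that $\dfrac{\partial}{\partial t}\,(e^{t\cos x}) = \cos x\,e^{t\cos x}$ is continuous in x and t for $0 \le x \le \pi$,
|t| < T, so that

$$\frac{dF}{dt} = \int_0^\pi \cos x\,e^{t\cos x}\,dx$$

It is easy to justify a further differentiation so that

$$\frac{d^2F}{dt^2} = \int_0^\pi \cos^2 x\,e^{t\cos x}\,dx = F(t) - \int_0^\pi \sin^2 x\,e^{t\cos x}\,dx$$
$$= F(t) + \frac{1}{t}\int_0^\pi \sin x\,d(e^{t\cos x})$$
$$= F(t) + \frac{1}{t}\,\sin x\,e^{t\cos x}\Big|_{x=0}^{x=\pi} - \frac{1}{t}\int_0^\pi \cos x\,e^{t\cos x}\,dx$$
$$= F(t) - \frac{1}{t}\frac{dF}{dt}$$

Bessel's differential equation (see Sec. 5.13) for n = 0 is $\dfrac{d^2w}{dz^2} + \dfrac{1}{z}\dfrac{dw}{dz} + w = 0$. If
z = it, we have $\dfrac{d^2w}{dt^2} + \dfrac{1}{t}\dfrac{dw}{dt} - w = 0$, which is satisfied by F(t) of (10.60).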
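The differential equation satisfied by F(t) can be checked numerically by differentiating under the integral sign. The sketch below assumes that (10.60) is the integral of e^{t cos x} over x from 0 to pi; the Simpson-rule helper `integrate` and the name `residual` are illustrative choices made here.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def F(t):
    return integrate(lambda x: math.exp(t * math.cos(x)), 0.0, math.pi)

def dF(t):
    # First derivative obtained by differentiating under the integral sign.
    return integrate(lambda x: math.cos(x) * math.exp(t * math.cos(x)), 0.0, math.pi)

def d2F(t):
    # Second derivative obtained the same way.
    return integrate(lambda x: math.cos(x) ** 2 * math.exp(t * math.cos(x)), 0.0, math.pi)

def residual(t):
    """Left side of F'' + F'/t - F = 0; it should vanish for t != 0."""
    return d2F(t) + dF(t) / t - F(t)

for t in (0.5, 1.5, 3.0):
    print(t, F(t), residual(t))
```

The residual is zero to within quadrature error, which is consistent with F(t) solving the modified form of Bessel's equation quoted in the example.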
Problems 
1. Show that 
fatn-slyer al 
o (x? +t)" di*Lv/ v/t 


b : b 
2. From [ cos tx dx = (1/t)(sin bt — sin at) compute | x*P cos tz dz, p a posi- 
a 
tive integer. 




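3. Show that

$$\int_0^\pi \frac{\sin x\,dx}{\sqrt{1 - 2a\cos x + a^2}} = \begin{cases} 2 & 0 \le a \le 1 \\[1mm] \dfrac{2}{a} & a \ge 1 \end{cases}$$

4. Given $F(t) = \int_{\varphi(t)}^{\psi(t)} f(x, t)\,dx$, show that

$$\frac{dF(t)}{dt} = \int_{\varphi(t)}^{\psi(t)} \frac{\partial f(x, t)}{\partial t}\,dx + f(\psi(t), t)\,\frac{d\psi}{dt} - f(\varphi(t), t)\,\frac{d\varphi}{dt}$$

for certain restrictions on f(x, t), $\varphi(t)$, $\psi(t)$.

5. By two methods show that

$$\frac{d}{dt}\int_t^{t^2} (2x + t)\,dx = 4t^3 + 3t^2 - 4t$$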


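10.18. Improper and Infinite Integrals. The function $f(x) = 1/\sqrt{x}$,
x > 0, f(0) = 1, is well defined on the interval $0 \le x \le 1$. Since f(x) is
not bounded on $0 \le x \le 1$, we are not certain that it is R-integrable on
this range. However, for $0 < \epsilon \le x \le 1$, we have

$$F(\epsilon) = \int_\epsilon^1 \frac{dx}{\sqrt{x}} = 2 - 2\sqrt{\epsilon}$$

As $\epsilon \to 0$, $F(\epsilon) \to 2$ and we define

$$\int_0^1 \frac{dx}{\sqrt{x}} = \lim_{\epsilon \to 0} \int_\epsilon^1 \frac{dx}{\sqrt{x}} = 2 \tag{10.61}$$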

We notice that f(0) could have been defined in any way we please without
affecting the value of the improper integral $\int_0^1 \dfrac{dx}{\sqrt{x}}$. Let the reader
show that the improper integral $\int_0^1 \dfrac{dx}{x}$ fails to exist.
A Practical Rule. Let f(x) be R-integrable for the range $a + \epsilon \le x \le b$,
for all $\epsilon > 0$. Define $\psi(x)$ by

$$\psi(x) = (x - a)^\mu f(x)$$

1. If for $\mu < 1$ we have $|\psi(x)| < M$ = constant for $a \le x \le b$, then
$\lim_{\epsilon \to 0} \int_{a+\epsilon}^{b} f(x)\,dx$ exists.
2. If for $\mu \ge 1$ we have $|\psi(x)| > m > 0$ for $a \le x \le b$, then
$\lim_{\epsilon \to 0} \int_{a+\epsilon}^{b} f(x)\,dx$ fails to exist.


These results are left as an exercise for the reader. One easily sees 


b 
that lim a. exists if u < 1 and fails to exist if » = 1. 
a ate (x pa a)# 


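If $f(x)$ is continuous for $a \le x \le b$ except at $x = c$, $a < c < b$, we define

$$\int_a^b f(x)\,dx = \lim_{\substack{\epsilon \to 0 \\ \epsilon > 0}} \int_a^{c-\epsilon} f(x)\,dx + \lim_{\substack{\delta \to 0 \\ \delta > 0}} \int_{c+\delta}^b f(x)\,dx \tag{10.62}$$

provided these limits exist. It is important that $\epsilon$ and $\delta$ approach zero independently.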


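Example 10.20. We consider $\int_0^a \frac{dx}{\sqrt{a^2 - x^2}}$. The upper limit may give us trouble. However, $f(x) = (a - x)^{-1/2}(a + x)^{-1/2}$, so that $\psi(x) = (a - x)^{1/2} f(x) = (a + x)^{-1/2}$ is bounded for $0 \le x \le a$. Since $\mu = \frac{1}{2}$ the integral exists and, actually,

$$\int_0^a \frac{dx}{\sqrt{a^2 - x^2}} = \lim_{\epsilon \to 0} \sin^{-1}\frac{a - \epsilon}{a} = \frac{\pi}{2}$$
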
The integral $\int_a^\infty f(x)\,dx$ is called an infinite integral and is defined as follows: Let $f(x)$ be R-integrable for $a \le x \le X$, $X$ arbitrary. We define

$$\int_a^\infty f(x)\,dx = \lim_{X \to \infty} \int_a^X f(x)\,dx \tag{10.63}$$

provided this limit exists. For example,

$$\int_0^\infty \frac{dx}{1 + x^2} = \lim_{X \to \infty} \int_0^X \frac{dx}{1 + x^2} = \lim_{X \to \infty} \tan^{-1} X = \frac{\pi}{2}$$

Let the reader verify that $\int_0^\infty \cos x\,dx$ does not exist. If $\int_a^\infty |f(x)|\,dx$ exists, we say that $f(x)$ is absolutely integrable.
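From (10.63) we note that for any $\epsilon > 0$ an $X_0$ exists such that

$$\left| \int_a^\infty f(x)\,dx - \int_a^X f(x)\,dx \right| < \epsilon \qquad \text{for } X \ge X_0$$

or

$$\left| \int_X^\infty f(x)\,dx \right| < \epsilon \qquad \text{for } X \ge X_0 \tag{10.64}$$

One sees immediately that if $\int_a^\infty |f(x)|\,dx$ exists then $\int_a^\infty f(x)\,dx$ exists, since

$$\left| \int_X^\infty f(x)\,dx \right| \le \int_X^\infty |f(x)|\,dx$$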


A simple test for the convergence or existence of $\int_a^\infty f(x)\,dx$ is as follows: If a function $\varphi(x)$ exists such that $\int_a^\infty |\varphi(x)|\,dx$ converges, and if $|\varphi(x)| \ge |f(x)|$ for $x \ge a$, then $\int_a^\infty f(x)\,dx$ converges. We leave the proof of this statement as a simple exercise for the reader.
Another simple test for the convergence of an infinite integral is based upon the integration-by-parts formula. From

$$\int_a^X u(x)\,dv(x) = u(X)v(X) - u(a)v(a) - \int_a^X v(x)\,du(x)$$

we note that, if $\lim_{X \to \infty} u(X)v(X)$ exists and if $\int_a^\infty v(x)\,du(x)$ exists, then $\int_a^\infty u(x)\,dv(x)$ exists.


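Example 10.21. We consider $\int_0^\infty \frac{\sin x}{x}\,dx$. The origin need not concern us since $\lim_{x \to 0} \frac{\sin x}{x} = 1$. We have

$$\int_{\pi/2}^X \frac{\sin x}{x}\,dx = -\int_{\pi/2}^X \frac{d(\cos x)}{x} = -\frac{\cos X}{X} - \int_{\pi/2}^X \frac{\cos x}{x^2}\,dx$$

Now $\lim_{X \to \infty} \frac{\cos X}{X} = 0$ and $\lim_{X \to \infty} \int_{\pi/2}^X \frac{\cos x}{x^2}\,dx$ exists since $\int_{\pi/2}^\infty \frac{dx}{x^2}$ exists. Since $\int_0^{\pi/2} \frac{\sin x}{x}\,dx$ exists, we see that $\int_0^\infty \frac{\sin x}{x}\,dx$ exists. Its value is $\pi/2$.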


We now consider an infinite integral involving a parameter. Let $\varphi(t)$ be defined by

$$\varphi(t) = \int_a^\infty f(x, t)\,dx \qquad t_0 \le t \le t_1 \tag{10.65}$$

We assume that the integral of (10.65) exists for each value of $t$ on the range $t_0 \le t \le t_1$. If we fix our attention on a specific value of $t$, we have for any $\epsilon > 0$ an $X_0$ such that

$$\left| \int_X^\infty f(x, t)\,dx \right| < \epsilon \qquad X \ge X_0 \tag{10.66}$$

The value of $X_0$ (for a fixed $\epsilon$) may well depend on $t$. If, for any $\epsilon > 0$, there exists an $X_0$ independent of $t$ such that (10.66) holds for all $t$ on the range $t_0 \le t \le t_1$, we say that the infinite integral converges uniformly with respect to $t$.

Let the reader show that if $g(x) \ge |f(x, t)|$ for $t_0 \le t \le t_1$, and if $\int_a^\infty g(x)\,dx$ exists, then $\int_a^\infty f(x, t)\,dx$ converges uniformly.


Example 10.22. We consider $F(t) = \int_0^\infty e^{-x^2} \cos 2xt\,dx$. Since $e^{-x^2} \ge |e^{-x^2} \cos 2xt|$ for all $t$ and since $\int_0^\infty e^{-x^2}\,dx = \sqrt{\pi}/2$ exists, $F(t)$ exists for all $t$.




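We now show that if $f(x, t)$ is uniformly continuous in $t$ for $a \le x \le X$, $X$ finite but arbitrary, then $\varphi(t)$ of (10.65) is continuous in $t$ provided $\int_a^\infty f(x, t)\,dx$ converges uniformly. We have

$$\begin{aligned}
\varphi(t + \Delta t) - \varphi(t) &= \int_a^\infty [f(x, t + \Delta t) - f(x, t)]\,dx \\
&= \int_a^X [f(x, t + \Delta t) - f(x, t)]\,dx + \int_X^\infty f(x, t + \Delta t)\,dx - \int_X^\infty f(x, t)\,dx
\end{aligned}$$

From uniform convergence we have for any $\epsilon/3 > 0$ an $X$ such that

$$\left| \int_X^\infty f(x, t + \Delta t)\,dx \right| < \frac{\epsilon}{3} \qquad \left| \int_X^\infty f(x, t)\,dx \right| < \frac{\epsilon}{3}$$

From Sec. 10.17 we note that for any $\epsilon/3 > 0$ a $\delta > 0$ exists such that

$$\left| \int_a^X [f(x, t + \Delta t) - f(x, t)]\,dx \right| < \frac{\epsilon}{3}$$

for $|\Delta t| < \delta$. This is the property that $\int_a^X f(x, t)\,dx$ is continuous in $t$, $X$ finite. Hence

$$|\varphi(t + \Delta t) - \varphi(t)| < \epsilon$$

for $|\Delta t| < \delta$, $\delta > 0$, so that $\varphi(t)$ is continuous. Q.E.D.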
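It is interesting to find a sufficient condition which enables one to write

$$\frac{d\varphi(t)}{dt} = \frac{d}{dt} \int_a^\infty f(x, t)\,dx = \int_a^\infty \frac{\partial f(x, t)}{\partial t}\,dx \tag{10.67}$$

Let the reader first show that, if $G(t) = \int_a^\infty \frac{\partial f(x, t)}{\partial t}\,dx$ converges uniformly, then

$$\int_{t_0}^t \left[ \int_a^\infty \frac{\partial f(x, t)}{\partial t}\,dx \right] dt = \int_a^\infty \left[ \int_{t_0}^t \frac{\partial f(x, t)}{\partial t}\,dt \right] dx$$

$$\int_{t_0}^t G(t)\,dt = \varphi(t) - \varphi(t_0) \tag{10.68}$$

provided $\partial f/\partial t$ is continuous for $x \ge a$, $t_0 \le t \le t_1$ [see (10.59)]. Differentiation of (10.68) leads to (10.67).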
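Example 10.23. We consider $F(t)$ given in Example 10.22. From

$$f(x, t) = e^{-x^2} \cos 2xt$$

we have $\frac{\partial f}{\partial t} = -2x e^{-x^2} \sin 2xt$. To show that $\int_0^\infty \frac{\partial f(x, t)}{\partial t}\,dx$ converges uniformly for all $t$, we note that

$$\left| \int_0^\infty \frac{\partial f(x, t)}{\partial t}\,dx \right| \le \int_0^\infty 2x e^{-x^2}\,dx = 1$$

Hence

$$\begin{aligned}
\frac{dF}{dt} &= -\int_0^\infty 2x e^{-x^2} \sin 2xt\,dx \\
&= \int_0^\infty \sin 2xt\; d\left(e^{-x^2}\right) = -2t \int_0^\infty e^{-x^2} \cos 2xt\,dx \\
&= -2t F(t)
\end{aligned}$$

Integration yields $F(t) = A e^{-t^2} = \int_0^\infty e^{-x^2} \cos 2xt\,dx$. At $t = 0$ we have

$$A = \int_0^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}$$

so that

$$\int_0^\infty e^{-x^2} \cos 2xt\,dx = \frac{\sqrt{\pi}}{2}\, e^{-t^2}$$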
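The closed form obtained in Example 10.23 can be checked numerically. The sketch below is only an illustration: it assumes that truncating the infinite range at $x = 8$ is harmless (the integrand is below $e^{-64}$ there) and uses a plain trapezoidal rule. The names `F_numeric` and `F_closed` are ours.

```python
import math

def F_numeric(t: float, upper: float = 8.0, steps: int = 20000) -> float:
    """Trapezoidal estimate of the integral of exp(-x^2) cos(2xt) over [0, upper]."""
    h = upper / steps
    # End points carry weight 1/2; f(0) = 1.
    total = 0.5 * (1.0 + math.exp(-upper * upper) * math.cos(2 * upper * t))
    for i in range(1, steps):
        x = i * h
        total += math.exp(-x * x) * math.cos(2 * x * t)
    return total * h

def F_closed(t: float) -> float:
    """Closed form (sqrt(pi)/2) exp(-t^2) from Example 10.23."""
    return 0.5 * math.sqrt(math.pi) * math.exp(-t * t)

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, F_numeric(t), F_closed(t))
```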
Problems 
1 — att? 
1. Show that I See dt exists forO Sa < 1 
1/k ax 





2. Show that exists 


Wi = ~ 3 | — k2y2) 
3. Show that $\int_0^{\pi/2} \ln \sin x\,dx$ exists.

4. Show that $\pi/2 - \tan^{-1} t = \int_0^\infty (1/x) e^{-tx} \sin x\,dx$, $t \ge 0$.

5. Show that $\int_0^\infty \frac{\cos tx}{1 + x^2}\,dx$ converges uniformly for all $t$. Evaluate this integral by contour integration.

6. Show that $\int_0^\infty x \sin e^x\,dx = \int_0^\infty x e^{-x}\, d(-\cos e^x)$ exists.


7. Let f(z, t) = zt. Show that 


= 1 1 20 
i; [i 1@. t) ade ff, f(x, t) dx at 


© agin2 
8. Consider F(y) = i: aria dx, and show that F(y) = ; y 





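10.19. Methods of Integration. We first consider the special indefinite integral

$$\int \frac{dx}{(x^2 + 1)(x - 2)} \tag{10.69}$$

In the elementary calculus the reader was told to expand $(x^2 + 1)^{-1}(x - 2)^{-1}$ in partial fractions, obtaining

$$\frac{1}{(x^2 + 1)(x - 2)} = \frac{Ax + B}{x^2 + 1} + \frac{C}{x - 2} = \frac{(A + C)x^2 + (B - 2A)x + C - 2B}{(x^2 + 1)(x - 2)} \tag{10.70}$$

Since (10.70) holds for all $x$, one has $A + C = 0$, $B - 2A = 0$, $C - 2B = 1$, so that $A = -\frac{1}{5}$, $B = -\frac{2}{5}$, $C = \frac{1}{5}$. The integral (10.69) can be written

$$\begin{aligned}
\int \frac{dx}{(x^2 + 1)(x - 2)} &= -\frac{1}{5} \int \frac{x\,dx}{x^2 + 1} - \frac{2}{5} \int \frac{dx}{x^2 + 1} + \frac{1}{5} \int \frac{dx}{x - 2} \\
&= -\frac{1}{10} \ln (x^2 + 1) - \frac{2}{5} \tan^{-1} x + \frac{1}{5} \ln (x - 2) + \text{constant}
\end{aligned}$$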
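The coefficients of (10.70) can be confirmed with exact rational arithmetic. The following Python sketch is only a check, with helper names `lhs` and `rhs` of our own choosing; it tests the three linear conditions and compares both sides of (10.70) at a few rational points.

```python
from fractions import Fraction

# Coefficients solving A + C = 0, B - 2A = 0, C - 2B = 1.
A, B, C = Fraction(-1, 5), Fraction(-2, 5), Fraction(1, 5)

def lhs(x: Fraction) -> Fraction:
    """Left side of (10.70)."""
    return 1 / ((x * x + 1) * (x - 2))

def rhs(x: Fraction) -> Fraction:
    """Partial-fraction form (Ax + B)/(x^2 + 1) + C/(x - 2)."""
    return (A * x + B) / (x * x + 1) + C / (x - 2)

for x in (Fraction(0), Fraction(3), Fraction(-7, 2)):
    print(x, lhs(x), rhs(x), lhs(x) == rhs(x))
```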


The results of Sec. 8.11 justify the expansion (10.70). Since $x^2 + 1$ and $x - 2$ are relatively prime (there is no polynomial of degree $\ge 1$ which divides both $x^2 + 1$ and $x - 2$), there exist two polynomials $P(x)$, $Q(x)$ such that

$$P(x)(x^2 + 1) + Q(x)(x - 2) = 1$$

with the degree of $P(x)$ less than 1 and the degree of $Q(x)$ less than 2. Thus $P(x) = C$, $Q(x) = Ax + B$, and (10.70) results upon division by $(x^2 + 1)(x - 2)$. To evaluate the integral




















ae 
(7? + 1)(% — 2) 
one notes that 
sc ae he ig 
(@?+1)(2—-2) £S52?4+1 52?4+1'° 52x-2 
~_if(,_ 1 )j)_?# #_41 2 
7 (1 r+ | 5a tatty) 
so that 
d 
| ecg ct terte-din@t + tine 240 


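If the zeros of a polynomial $P(x)$ are known, one can evaluate

$$\int \frac{Q(x)}{P(x)}\,dx \tag{10.71}$$

provided $Q(x)$ is a polynomial. One writes

$$\frac{1}{P(x)} = \frac{a_1}{x - r_1} + \frac{a_2}{x - r_2} + \cdots + \frac{a_n}{x - r_n}$$

with $a_k = \lim_{x \to r_k} \frac{x - r_k}{P(x)} = \frac{1}{P'(r_k)}$, $k = 1, 2, \ldots, n$. One then divides $Q(x)$ by $x - r_k$ and performs a series of simple integrations for $k = 1, 2, \ldots, n$.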




If $P(x)$ is a polynomial such that $x = r$ is a $k$th-fold zero of $P(x)$ we have

$$P(x) = (x - r)^k Q(x) \qquad Q(r) \ne 0$$

Thus

$$P'(x) = (x - r)^{k-1} [kQ(x) + (x - r)Q'(x)]$$

and $P'(r) = 0$ provided $k > 1$. For $k = 1$, $P'(r) \ne 0$, so that $P(x)$ and $P'(x)$ have no zeros in common if $P(x)$ has no multiple zeros. In this latter case $P(x)$ and $P'(x)$ are relatively prime so that polynomials $A(x)$ and $B(x)$ exist such that

$$A(x)P(x) + B(x)P'(x) = 1 \tag{10.72}$$


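To evaluate an integral of the type

$$\int \frac{C(x)}{[P(x)]^m}\,dx \tag{10.73}$$

provided $P(x)$ has no multiple zeros, we make use of (10.72). Thus

$$\frac{C(x)}{[P(x)]^m} = \frac{A(x)C(x)}{[P(x)]^{m-1}} + \frac{B(x)P'(x)C(x)}{[P(x)]^m}$$

and

$$\int \frac{C(x)}{[P(x)]^m}\,dx = \int \frac{A(x)C(x)}{[P(x)]^{m-1}}\,dx - \frac{1}{m - 1} \frac{B(x)C(x)}{[P(x)]^{m-1}} + \frac{1}{m - 1} \int \frac{[B(x)C(x)]'}{[P(x)]^{m-1}}\,dx \tag{10.74}$$

Since $m$ has been reduced by 1, we can repeat this process until integrals of the type (10.71) occur.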


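Example 10.24. We consider

$$I_n(x) = \int_0^x \frac{dt}{(t^2 + 1)^n} = \int_0^x \frac{dt}{(t^2 + 1)^{n-1}} - \int_0^x \frac{(t/2)(2t)\,dt}{(t^2 + 1)^n}$$

since $1 \cdot (t^2 + 1) - (t/2)(2t) = 1$. Integration by parts yields

$$\begin{aligned}
I_n(x) &= I_{n-1}(x) + \frac{t(t^2 + 1)^{1-n}}{2(n - 1)} \bigg|_0^x - \frac{1}{2(n - 1)} \int_0^x \frac{dt}{(t^2 + 1)^{n-1}} \\
&= \frac{2n - 3}{2n - 2}\, I_{n-1}(x) + \frac{x}{2(n - 1)(x^2 + 1)^{n-1}}
\end{aligned} \tag{10.75}$$

From $I_1(x) = \tan^{-1} x$ one has $I_2(x) = \frac{1}{2} \tan^{-1} x + \frac{1}{2} x/(x^2 + 1)$. Repeated application of (10.75) yields

$$I_n(x) = \frac{1 \cdot 3 \cdot 5 \cdots (2n - 3)}{2 \cdot 4 \cdot 6 \cdots (2n - 2)} \tan^{-1} x + R(x)$$

where $R(x)$ is a rational function of $x$.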
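The reduction formula (10.75) translates directly into a short recursion. The Python sketch below is an illustration under our own naming (`I_rec`, `I_quad`); it compares the recursion with a Simpson's-rule estimate of the defining integral.

```python
import math

def I_rec(n: int, x: float) -> float:
    """I_n(x) from the reduction formula (10.75), starting at I_1(x) = arctan x."""
    if n == 1:
        return math.atan(x)
    return ((2 * n - 3) / (2 * n - 2)) * I_rec(n - 1, x) \
        + x / (2 * (n - 1) * (x * x + 1) ** (n - 1))

def I_quad(n: int, x: float, steps: int = 2000) -> float:
    """Simpson's-rule estimate of the integral of (t^2 + 1)^(-n) over [0, x]; steps must be even."""
    f = lambda t: 1.0 / (t * t + 1.0) ** n
    h = x / steps
    total = f(0.0) + f(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

for n in (2, 3, 5):
    print(n, I_rec(n, 1.5), I_quad(n, 1.5))
```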




If $F(x)$ is a polynomial, it can be written in the canonical form

$$F(x) = X_1 X_2^2 X_3^3 \cdots X_m^m$$

with $X_1, X_2, \ldots, X_m$ polynomials relatively prime to each other. Since $X_1$ is relatively prime to $X_2^2 \cdots X_m^m$, we have

$$A(x) X_1 + B(x) X_2^2 \cdots X_m^m = 1$$

so that

$$\frac{B(x)}{X_1} + \frac{A(x)}{X_2^2 \cdots X_m^m} = \frac{1}{X_1 X_2^2 \cdots X_m^m}$$

Continuing this process by noting that $X_2^2$ is relatively prime to $X_3^3 \cdots X_m^m$, one can write

$$\frac{f(x)}{F(x)} = \frac{A_1(x)}{X_1} + \frac{A_2(x)}{X_2^2} + \cdots + \frac{A_m(x)}{X_m^m}$$

if $f(x)$ is also a polynomial. The evaluation of

$$\int \frac{f(x)}{F(x)}\,dx$$

is reduced to an evaluation of integrals of the type given by (10.73), which in turn can be reduced to the simpler form given by (10.71).
Let $R(x, y)$ be a rational function of $x$ and $y$,

$$R(x, y) = \frac{\sum a_{ij} x^i y^j}{\sum b_{ij} x^i y^j} \tag{10.76}$$

To evaluate

$$\int R(x, \sqrt{Ax + B})\,dx \tag{10.77}$$

with $y = \sqrt{Ax + B}$, one notes that the change of variable $t^2 = Ax + B$, $dx = (2t\,dt)/A$, $x = (t^2 - B)/A$, reduces (10.77) to an integral of the type discussed above.


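Example 10.25. To evaluate

$$\int \frac{\sqrt{x + 1}}{x}\,dx \tag{10.78}$$

we let $t^2 = x + 1$, $2t\,dt = dx$, so that (10.78) becomes

$$\int \frac{2t^2\,dt}{t^2 - 1} = 2 \int \left( 1 + \frac{1}{t^2 - 1} \right) dt = 2 \int \left[ 1 + \frac{1}{2} \left( \frac{1}{t - 1} - \frac{1}{t + 1} \right) \right] dt = 2t + \ln \frac{t - 1}{t + 1} + C$$

and

$$\int \frac{\sqrt{x + 1}}{x}\,dx = 2\sqrt{x + 1} + \ln \frac{\sqrt{x + 1} - 1}{\sqrt{x + 1} + 1} + C$$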













We can evaluate

$$\int R(x, \sqrt{ax^2 + bx + c})\,dx \tag{10.79}$$

provided $R$ is a rational function. Let $x = \alpha$, $y = \beta$ satisfy

$$y^2 = Ax^2 + Bx + C$$

so that $\beta^2 = A\alpha^2 + B\alpha + C$. The equation of the straight line through the point $(\alpha, \beta)$ is $y - \beta = t(x - \alpha)$ with $t$ the slope. The intersection of this straight line with the curve $y^2 = Ax^2 + Bx + C$ is easily seen to yield the coordinates

$$x = \frac{B + A\alpha - 2\beta t + \alpha t^2}{t^2 - A} \qquad y = \beta + \frac{(B + 2A\alpha - 2\beta t)t}{t^2 - A} \tag{10.80}$$

Since $x$, $y$, and $dx$ are rational functions of $t$, one can integrate (10.79) by the methods discussed above.
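The parametrization (10.80) can be tested with exact rational arithmetic. The sketch below is illustrative only: the function name `chord_point` and the sample curve $y^2 = x^2 + 9$ with base point $(0, 3)$ are our own choices.

```python
from fractions import Fraction

def chord_point(t, A, B, alpha, beta):
    """Second intersection of y - beta = t (x - alpha) with y^2 = A x^2 + B x + C, per (10.80)."""
    x = (B + A * alpha - 2 * beta * t + alpha * t * t) / (t * t - A)
    y = beta + (B + 2 * A * alpha - 2 * beta * t) * t / (t * t - A)
    return x, y

# Curve y^2 = x^2 + 9 (A = 1, B = 0, C = 9) through the base point (0, 3).
A, B, C = 1, 0, 9
for t in (Fraction(1, 2), Fraction(1, 3), Fraction(3)):
    x, y = chord_point(t, A, B, 0, 3)
    # Every rational slope t gives a rational point lying on the curve.
    print(t, x, y, y * y == A * x * x + B * x + C)
```

Because each rational slope produces a rational point of the curve, $x$ and $y$ are rational functions of $t$, which is what makes the substitution work.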


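Example 10.26. We evaluate $\int \frac{dx}{x\sqrt{x^2 + p^2}}$. We have $y^2 = x^2 + p^2$, $A = 1$, $B = 0$, $C = p^2$, and we choose $\alpha = 0$ so that $\beta = p$. From (10.80)

$$x = \frac{2pt}{1 - t^2} \qquad y = p\,\frac{1 + t^2}{1 - t^2} \qquad dx = \frac{2p(1 + t^2)}{(1 - t^2)^2}\,dt$$

so that

$$\int \frac{dx}{x\sqrt{x^2 + p^2}} = \frac{1}{p} \int \frac{dt}{t} = \frac{1}{p} \ln t + C = \frac{1}{p} \ln \frac{\sqrt{x^2 + p^2} - p}{x} + C$$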
The integral (10.79) is a special case of the following type of integral. Suppose $R(x, y)$ is a rational function of $x$ and $y$, with $y$ depending implicitly on $x$ through the relation $f(x, y) = 0$. If the curve $f(x, y) = 0$ is unicursal in the sense that we can describe the curve parametrically by $x = \varphi(t)$, $y = \psi(t)$ with $\varphi(t)$ and $\psi(t)$ rational functions of $t$, then

$$\int R(x, y)\,dx = \int R(\varphi(t), \psi(t))\,\varphi'(t)\,dt \tag{10.81}$$

and this integral can be evaluated by the methods discussed above.

A rational function of the trigonometric functions can be integrated by the use of the following change of variable:

$$x = 2\tan^{-1} t \qquad t = \tan\frac{x}{2} = \frac{\sin x}{1 + \cos x}$$

$$dx = \frac{2\,dt}{1 + t^2} \qquad \cos x = \frac{1 - t^2}{1 + t^2} \qquad \sin x = \frac{2t}{1 + t^2} \tag{10.82}$$


For example,

$$\int \frac{dx}{\sin x} = \int \frac{2/(1 + t^2)}{2t/(1 + t^2)}\,dt = \ln t + C = \ln \left( \tan \frac{x}{2} \right) + C$$


The elliptic integral of the second kind appears in a natural manner if one attempts to find the arc length of an ellipse. Let $x = a \sin \varphi$, $y = b \cos \varphi$, $0 \le \varphi < 2\pi$ be the parametric representation of an ellipse. Then $ds^2 = dx^2 + dy^2 = (a^2 \cos^2 \varphi + b^2 \sin^2 \varphi)\,d\varphi^2$, so that

$$L = a \int_0^{2\pi} \sqrt{1 - e^2 \sin^2 \varphi}\;d\varphi$$

with $e^2 = (a^2 - b^2)/a^2 < 1$ if $b^2 < a^2$. The elliptic integral of the second kind is defined as

$$E_2(k, \varphi) = \int_0^\varphi \sqrt{1 - k^2 \sin^2 \varphi}\;d\varphi \qquad |k| < 1 \tag{10.83}$$

The period of a simple pendulum leads to the elliptic integral of the first kind, given by

$$E_1(k, \varphi) = \int_0^\varphi \frac{d\varphi}{\sqrt{1 - k^2 \sin^2 \varphi}} \qquad |k| < 1 \tag{10.84}$$

Extensive tables can be found in the literature for the evaluation of these important elliptic integrals.
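In place of tables, the arc-length integral for the ellipse can be approximated by simple quadrature. The following Python sketch is an illustration only; it assumes $b \le a$, uses Simpson's rule with an even number of steps, and the name `ellipse_perimeter` is ours.

```python
import math

def ellipse_perimeter(a: float, b: float, steps: int = 2000) -> float:
    """Simpson estimate of L = a * integral over [0, 2 pi] of sqrt(1 - e^2 sin^2 phi); needs b <= a."""
    e2 = (a * a - b * b) / (a * a)          # squared eccentricity
    f = lambda phi: math.sqrt(1.0 - e2 * math.sin(phi) ** 2)
    h = 2 * math.pi / steps
    total = f(0.0) + f(2 * math.pi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return a * total * h / 3

print(ellipse_perimeter(1.0, 1.0))   # circle of radius 1: should agree with 2*pi
print(ellipse_perimeter(2.0, 1.0))   # ellipse with semiaxes 2 and 1
```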





Problems 
Integrate the following: 


adr 
° (ax + b)? 
2 - aE 
: x*(ax + b)? 
dx 
e: | x(x — 1)3 


x dz 


© Ge ee ew) 
6. | ere ee 
(eee La 

zx 


oa 


7. [ SE frre > G,0 <0 
xr Vax? + br +c 


dx 
ee canes ta nai 2 2 2 2 
| reset > 0 0 


9 i a 
; asin cz + b cos cx 
10. Let $\alpha$ and $\beta$ be real roots of $Ax^2 + Bx + C = 0$. Show that

$$\sqrt{Ax^2 + Bx + C} = (x - \beta) \sqrt{A\,\frac{x - \alpha}{x - \beta}}$$

and that $t = \sqrt{A\,\frac{x - \alpha}{x - \beta}}$ defines $x$ as a rational function of $t$. Apply this result to evaluate $\int \sqrt{x^2 - 3x + 2}\,dx$.

11. Let $I_m = \int_0^{\pi/2} \sin^m x\,dx$. Show that

$$I_{2p} = \frac{(2p - 1)!!}{(2p)!!}\,\frac{\pi}{2} \qquad I_{2p+1} = \frac{(2p)!!}{(2p + 1)!!}$$

with $(2p)!! = 2 \cdot 4 \cdot 6 \cdots (2p)$, $(2p + 1)!! = 1 \cdot 3 \cdot 5 \cdots (2p + 1)$.


10.20. Sequences and Series. A sequence of constant terms

$$s_1, s_2, s_3, \ldots, s_n, \ldots \tag{10.85}$$

is simply a set of numbers which can be put into one-to-one correspondence with the positive integers. To completely determine the sequence, we must be given the specific rule for determining the $n$th term, $n = 1, 2, 3, \ldots$. We may also look upon the sequence $\{s_n\}$ of (10.85) as a function defined only over the integers, so that $f(n) = s_n$, $n = 1, 2, 3, \ldots$.

DEFINITION 10.26. The sequence of terms $\{s_n\}$ is said to converge if a number $S$ exists such that

$$\lim_{n \to \infty} s_n = S \tag{10.86}$$

Equation (10.86) states that for each $\epsilon > 0$ there exists an integer $N(\epsilon)$ such that $|S - s_n| < \epsilon$ for $n \ge N(\epsilon)$. In general, the smaller $\epsilon$ is chosen, the greater is the corresponding $N$. The existence of an integer $N$ for the given $\epsilon$ implies that the sequence becomes and remains within an $\epsilon$ distance of $S$.


Example 10.27. It is easy to show that the sequence $2, 1\frac{1}{2}, 1\frac{1}{3}, \ldots, 1 + 1/n, \ldots$ converges to the limit 1. We have $|s_n - 1| = |1/n| < \epsilon$ if $n > 1/\epsilon$, so that $N(\epsilon) = [1/\epsilon]$, where $[1/\epsilon]$ is the first integer greater than $1/\epsilon$.


In most cases we cannot readily determine the limit of the sequence 
even though the sequence converges. Cauchy obtained a criterion for 
the convergence of a sequence which does not depend on knowing the 
limit of the sequence. 

Cauchy's Criterion. If, for any $\epsilon > 0$, an integer $N$ exists such that $|s_{n+p} - s_n| < \epsilon$ for $n \ge N$ and all $p \ge 0$, the sequence $\{s_n\}$ converges to a unique limit $S$.

This criterion is certainly necessary, for if $\lim_{n \to \infty} s_n = S$, then $|S - s_n| < \epsilon/2$ for $n \ge N$, $|S - s_{n+p}| < \epsilon/2$ for $n \ge N$, $p \ge 0$, so that $|s_{n+p} - s_n| < \epsilon$ for $n \ge N$, $p \ge 0$. Conversely, assume that a given sequence satisfies Cauchy's criterion. Choose $\epsilon = 1$, so that $|s_{n+p} - s_n| < 1$ for $n \ge N_1$, $p \ge 0$. Hence $|s_{N_1+p} - s_{N_1}| < 1$ for $p \ge 0$, and all terms in the sequence following $s_{N_1}$ are within a unit distance of $s_{N_1}$. There are only a finite number of terms preceding $s_{N_1}$, so that the sequence must be bounded. From the Weierstrass-Bolzano theorem at least one limit point exists for the infinite set of numbers $\{s_n\}$. Let us assume that two limit points exist, say, $S$ and $T$, $S < T$. We choose $\epsilon = (T - S)/2$ and note that $|s_{n+p} - s_n| < \epsilon$ for $n \ge N$, $p \ge 0$. This means that after $s_N$ the terms of the sequence are clustered within an $\epsilon$ distance of each other. Let the reader conclude that, if $S$ is a limit point of the set $\{s_n\}$, then $T$ cannot be a limit point, and conversely. Thus Cauchy's criterion guarantees that the sequence has a unique limit, $S$, with $\lim_{n \to \infty} s_n = S$. It should be understood that by first considering $\epsilon = 1$ we showed that the sequence was bounded. Then we deduced that only one limit point could exist.

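Example 10.28. If we return to the sequence of Example 10.27, we note that

$$|s_{n+p} - s_n| = \left| \frac{1}{n + p} - \frac{1}{n} \right| = \frac{p}{n(n + p)} < \frac{1}{n} < \epsilon$$

if $n > 1/\epsilon$, $p \ge 0$, since $p/(p + n) < 1$. Again $N = [1/\epsilon]$.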

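By applying the Weierstrass-Bolzano theorem it is very easy to prove that a bounded monotonic nondecreasing sequence of real numbers always converges. As an example of the use of this result, we consider the sequence

$$s_n = \left( 1 + \frac{1}{n} \right)^n \qquad n = 1, 2, 3, \ldots$$

We show first that the sequence is bounded. We have from Newton's binomial expansion

$$s_n = 1 + n \cdot \frac{1}{n} + \frac{n(n - 1)}{2!} \frac{1}{n^2} + \frac{n(n - 1)(n - 2)}{3!} \frac{1}{n^3} + \cdots + \frac{1}{n^n} \le 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!}$$

so that

$$s_n \le 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots < 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots = 3$$

Every term of the sequence is less than 3. Next we show that $s_n > s_{n-1}$, so that the sequence is monotonic increasing. First let us notice that, if $\alpha$ and $\beta$ are any two numbers such that $\alpha > \beta \ge 0$, then from

$$\alpha^n - \beta^n = (\alpha - \beta)(\alpha^{n-1} + \alpha^{n-2}\beta + \cdots + \alpha\beta^{n-2} + \beta^{n-1})$$

we obtain

$$n\alpha^{n-1}(\alpha - \beta) > \alpha^n - \beta^n > n\beta^{n-1}(\alpha - \beta)$$

Now let $\alpha = 1 + 1/(n - 1) > \beta = 1 + 1/n$, $n > 1$, so that

$$\beta^n > \alpha^n - n\alpha^{n-1}(\alpha - \beta) = \alpha^{n-1}[\alpha - n(\alpha - \beta)]$$

yields

$$s_n = \left( 1 + \frac{1}{n} \right)^n > s_{n-1} \left[ 1 + \frac{1}{n - 1} - \frac{1}{n - 1} \right] = s_{n-1} \qquad n > 1$$

Thus the sequence converges to a unique limit, called $e$.

$$\lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n = e = 2.71828 \cdots$$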
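The two properties just proved, boundedness by 3 and monotonic increase, can be observed numerically. The Python sketch below is only an illustration in floating-point arithmetic for $n \le 2000$; the variable names are ours.

```python
import math

def s(n: int) -> float:
    """n-th term of the sequence (1 + 1/n)^n."""
    return (1.0 + 1.0 / n) ** n

terms = [s(n) for n in range(1, 2001)]
increasing = all(x < y for x, y in zip(terms, terms[1:]))  # s_n > s_{n-1}
bounded = all(x < 3.0 for x in terms)                      # every term below 3

print(increasing, bounded)
print(terms[0], terms[9], terms[-1], math.e)
```

The convergence is slow: the gap between $s_n$ and the limit shrinks roughly like $1/n$.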


The infinite series

$$u_1 + u_2 + \cdots + u_n + \cdots \tag{10.87}$$

with $u_n$ well defined for $n = 1, 2, \ldots$ can be given a clear meaning if we reduce the series to a sequence. We define $s_1 = u_1$, $s_2 = u_1 + u_2$, ...,

$$s_n = \sum_{j=1}^n u_j \qquad n = 1, 2, 3, \ldots \tag{10.88}$$

If the sequence $\{s_n\}$ converges to a unique limit $S$, we say that the series $\sum_{j=1}^\infty u_j$ converges to $S$ and write

$$S = \sum_{j=1}^\infty u_j \tag{10.89}$$

If the sequence fails to converge, we say that the series diverges. In general, one of three things will occur when we consider the convergence of a series:

1. The series converges.

2. The series increases beyond bound, and so diverges properly.

3. The series oscillates and does not converge.

Example 10.29. The series $\sum_{j=0}^\infty m^j$, $|m| < 1$, converges to the limit $1/(1 - m)$ since

$$s_n = \sum_{j=0}^n m^j = \frac{1 - m^{n+1}}{1 - m} \to \frac{1}{1 - m}$$

as $n \to \infty$. The series $1 + 1 + 1 + \cdots + 1 + \cdots$ increases beyond bound since $s_n = n$. The series $1 - 1 + 1 - 1 + \cdots + (-1)^{n+1} + \cdots$ does not converge since $s_n = \frac{1}{2}[1 + (-1)^{n+1}]$ fails to converge. The terms of the sequence oscillate between 1 and zero.
If we apply the Cauchy criterion to the sequence $s_n = \sum_{j=1}^n u_j$, $n = 1, 2, \ldots$, we note that a necessary and sufficient condition for the convergence of the series $\sum_{j=1}^\infty u_j$ is that for any $\epsilon > 0$ an integer $N$ exists such that

$$\left| \sum_{j=n+1}^{n+p} u_j \right| < \epsilon \tag{10.90}$$

for $n \ge N$, all $p > 0$, since $s_{n+p} = \sum_{j=1}^{n+p} u_j$, $s_n = \sum_{j=1}^n u_j$. Formula (10.90) is equivalent to the statement that for any $\epsilon > 0$ an integer $N$ exists such that

$$\left| \sum_{j=n+1}^\infty u_j \right| < \epsilon \qquad \text{for } n \ge N$$

If we write

$$\sum_{j=1}^\infty u_j = \sum_{j=1}^n u_j + R_n$$

with $R_n = \sum_{j=n+1}^\infty u_j$, we note that the series converges if for any $\epsilon > 0$ we can find an integer $N$ such that all remainders, $R_n$, are less than $\epsilon$ in absolute value for $n \ge N$.

If we apply (10.90) for a convergent series with $p = 1$, we note that for any $\epsilon > 0$ an integer $N$ exists such that $|u_{n+1}| < \epsilon$ for $n \ge N$. Thus a necessary condition that a series converge is that the $n$th term of the series tends to zero as $n$ becomes infinite. However, the harmonic series $\sum_{n=1}^\infty \frac{1}{n}$ diverges even though the $n$th term tends to zero as $n$ becomes infinite. That $\sum_{n=1}^\infty \frac{1}{n}$ diverges is seen by writing

$$1 + \frac{1}{2} + \left( \frac{1}{3} + \frac{1}{4} \right) + \left( \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} \right) + \cdots > 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots$$
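The grouping argument says that the partial sum of the harmonic series up to $n = 2^k$ is at least $1 + k/2$. The short Python sketch below, with a helper name `harmonic` of our own, tabulates both quantities.

```python
def harmonic(n: int) -> float:
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1.0 / j for j in range(1, n + 1))

# Each doubling of n adds a block of terms whose sum is at least 1/2.
for k in range(0, 16):
    n = 2 ** k
    print(n, round(harmonic(n), 4), 1 + k / 2)
```

The partial sums grow without bound, but only logarithmically in $n$.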
We now consider some practical tests to determine whether a series of positive terms converges or not.

1. Since $u_n \ge 0$, we note that the sequence $s_n = \sum_{j=1}^n u_j$, $n = 1, 2, 3, \ldots$, is monotonic nondecreasing. If a constant, $M$, exists such that $s_n = \sum_{j=1}^n u_j < M$ for all $n$, the sequence is bounded and hence converges. Thus the series $\sum_{n=1}^\infty \frac{1}{n!}$ converges since

$$s_n = \sum_{j=1}^n \frac{1}{j!} < 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots = 2$$

for all $n$.
2. If $v_n \ge u_n \ge 0$ and $\sum_{n=1}^\infty v_n$ converges, then $\sum_{n=1}^\infty u_n$ converges. This is the comparison test of Weierstrass. The proof is trivial and is left as an exercise for the reader. Conversely, if $u_n \ge v_n \ge 0$, and if $\sum_{n=1}^\infty v_n$ diverges, then $\sum_{n=1}^\infty u_n$ diverges. The series $\sum_{n=1}^\infty \frac{1}{n^\alpha}$, $\alpha < 1$, diverges, since $1/n^\alpha \ge 1/n$ for $n = 1, 2, \ldots$.


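3. The Integral Test. Let $\varphi(x) \ge 0$ be a monotonic nonincreasing function defined for $x \ge 1$. Let the reader show that

$$\varphi(1) + \varphi(2) + \cdots + \varphi(n - 1) \ge \int_1^n \varphi(x)\,dx \ge \varphi(2) + \varphi(3) + \cdots + \varphi(n) \tag{10.91}$$

From (10.91) the reader can deduce that $\sum_{n=1}^\infty \varphi(n)$ converges or diverges according to whether $\int_1^\infty \varphi(x)\,dx$ exists or not.

The series $\sum_{n=1}^\infty \frac{1}{n^\alpha}$, $\alpha > 1$, is seen to converge since $\varphi(x) = \frac{1}{x^\alpha}$ is monotonic decreasing and $\lim_{n \to \infty} \int_1^n \frac{dx}{x^\alpha} = \frac{1}{\alpha - 1}$ for $\alpha > 1$.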


4. D'Alembert's Ratio Test. Suppose that a constant $d$ exists such that for $n \ge N$ we have $a_{n+1}/a_n < d$. If $d < 1$, $\sum_{n=1}^\infty a_n$ converges. From $a_{N+1} < d a_N$, $a_{N+2} < d a_{N+1} < d^2 a_N$, ..., $a_{N+p} < d^p a_N$, ... we note that

$$\sum_{n=N}^\infty a_n < a_N \sum_{r=0}^\infty d^r = \frac{a_N}{1 - d}$$

Thus $\sum_{n=1}^\infty a_n$ is bounded and so converges. If $a_{n+1}/a_n > d \ge 1$ for $n \ge N$, the reader can show that $\sum_{n=1}^\infty a_n$ diverges. At times it may be useful to consider

$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = d \tag{10.92}$$

If $d < 1$, then $a_{n+1}/a_n < d' < 1$ for $n \ge N$. Why? Thus the series converges. If $d > 1$, the series can be shown to diverge. The case $d = 1$ is indeterminate since examples can be found for which both convergence and divergence exist.
5. The Cauchy nth-root Test. Assume that $\lim_{n \to \infty} \sqrt[n]{a_n} = k < 1$. Then for $n \ge N$ we have $\sqrt[n]{a_n} < k' < 1$, so that $a_n < (k')^n$ for $n \ge N$. Since $\sum_{n=N}^\infty (k')^n$ converges, of necessity $\sum_{n=1}^\infty a_n$ converges. The reader should show that, if $\lim_{n \to \infty} \sqrt[n]{a_n} = k > 1$, then $\sum_{n=1}^\infty a_n$ diverges. The series $\sum_{n=1}^\infty \frac{1}{n^n}$ converges since $\sqrt[n]{1/n^n} = 1/n < 1$ for $n > 1$.


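6. Raabe's Test. We consider the series $\sum_{n=1}^\infty a_n$, $a_n > 0$. If

$$\frac{a_n}{a_{n+1}} = 1 + \frac{\sigma}{n} + f(n) \tag{10.93}$$

with $\lim_{n \to \infty} n f(n) = 0$, then $\sum_{n=1}^\infty a_n$ converges or diverges according as $\sigma > 1$ or $\sigma < 1$. From (10.93) we have

$$\lim_{n \to \infty} \left[ n\,\frac{a_n}{a_{n+1}} - (n + 1) \right] = \sigma - 1$$

Assume $\sigma > 1$ so that $\sigma - 1 = k > 0$. For $n \ge N$ we have

$$n\,\frac{a_n}{a_{n+1}} - (n + 1) > \frac{k}{2} \qquad \text{or} \qquad \frac{2}{k} [n a_n - (n + 1) a_{n+1}] > a_{n+1} \tag{10.94}$$

Let $n = N, N + 1, N + 2, \ldots, N + p - 1$, in (10.94), and add. We obtain

$$\sum_{n=N}^{N+p-1} a_{n+1} < \frac{2}{k} [N a_N - (N + p) a_{N+p}] < \frac{2}{k} N a_N$$

Since $(2/k) N a_N$ is a fixed number, the series $\sum_{n=N}^{N+p-1} a_{n+1}$ is bounded for all $p$. Hence $\sum_{n=1}^\infty a_n$ is bounded and so converges. If $\sigma < 1$, then for $n \ge N$ we have $a_{n+1} > \frac{n}{n + 1} a_n$. Thus

$$a_{N+1} > \frac{N}{N + 1} a_N, \quad a_{N+2} > \frac{N + 1}{N + 2} a_{N+1} > \frac{N}{N + 2} a_N, \quad \ldots, \quad a_{N+p} > \frac{N}{N + p} a_N, \quad \ldots$$

so that

$$\sum_{p=1}^\infty a_{N+p} > N a_N \sum_{p=1}^\infty \frac{1}{N + p} = \infty$$

and $\sum_{n=1}^\infty a_n$ diverges. It can also be shown that $\sum_{n=1}^\infty a_n$ diverges if $\sigma = 1$.

As an example we consider $\sum_{n=1}^\infty \frac{1}{n^2}$. We have

$$\frac{a_n}{a_{n+1}} = \frac{(n + 1)^2}{n^2} = \frac{n^2 + 2n + 1}{n^2} = 1 + \frac{2}{n} + \frac{1}{n^2}$$

Since $\lim_{n \to \infty} n \cdot \frac{1}{n^2} = 0$ and $\sigma = 2 > 1$, the series converges. The series $\sum_{n=1}^\infty \frac{1}{n}$ diverges since $\sigma = 1$.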
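Since $n(a_n/a_{n+1} - 1) = \sigma + n f(n)$, the exponent $\sigma$ of (10.93) can be estimated by evaluating this expression for large $n$. The Python sketch below does so with exact rationals; the helper `raabe_sigma` and the two sample series are our own illustration.

```python
from fractions import Fraction

def raabe_sigma(a, n: int) -> Fraction:
    """n * (a_n / a_{n+1} - 1), which equals sigma + n f(n) in the notation of (10.93)."""
    return n * (a(n) / a(n + 1) - 1)

inverse_square = lambda n: Fraction(1, n * n)   # terms of the series sum 1/n^2
harmonic_term = lambda n: Fraction(1, n)        # terms of the series sum 1/n

for n in (10, 100, 1000):
    print(n, float(raabe_sigma(inverse_square, n)), float(raabe_sigma(harmonic_term, n)))
```

The first column tends to 2 and the second is identically 1, in agreement with the two examples above.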
Of particular interest are those series with alternating positive and negative terms. We consider

$$\sum_{n=1}^\infty (-1)^{n+1} a_n = a_1 - a_2 + a_3 - a_4 + \cdots + (-1)^{n+1} a_n + \cdots \tag{10.95}$$

In order that $\sum_{n=1}^\infty (-1)^{n+1} a_n$ converge, of necessity, $a_n \to 0$ as $n \to \infty$. Let us assume further that $a_{n+1} \le a_n$ for $n = 1, 2, \ldots$. We show that the series given by (10.95) converges. First let us note that

$$\begin{aligned}
s_{2n} &= (a_1 - a_2) + (a_3 - a_4) + \cdots + (a_{2n-1} - a_{2n}) \ge 0 \\
&= a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{2n-2} - a_{2n-1}) - a_{2n} \le a_1
\end{aligned}$$

Hence the sequence $s_2, s_4, \ldots, s_{2n}, \ldots$ is monotonic nondecreasing and bounded, and thus converges to a unique limit $S$. Moreover $s_{2n+1} = s_{2n} + a_{2n+1}$, so that $\lim_{n \to \infty} s_{2n+1} = S$ since $\lim_{n \to \infty} a_{2n+1} = 0$ by assumption. Thus the alternating series converges to a unique limit $S \ge 0$.
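The behavior of the even partial sums can be watched on the series $1 - \frac{1}{2} + \frac{1}{3} - \cdots$. The Python sketch below is an illustration only; the comparison value $\ln 2$ is the well-known sum of this series and is not derived in the text, and the name `partial_sum` is ours.

```python
import math

def partial_sum(n: int) -> float:
    """s_n = 1 - 1/2 + 1/3 - ... + (-1)^(n+1)/n."""
    return sum((-1) ** (j + 1) / j for j in range(1, n + 1))

even_sums = [partial_sum(2 * n) for n in range(1, 200)]
nondecreasing = all(x <= y for x, y in zip(even_sums, even_sums[1:]))
bounded = all(0.0 <= x <= 1.0 for x in even_sums)   # 0 <= s_2n <= a_1 = 1

print(nondecreasing, bounded)
print(partial_sum(1000), math.log(2))
```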


If $\sum_{n=1}^\infty (-1)^{n+1} a_n$ converges but $\sum_{n=1}^\infty a_n$ diverges, we say that the series is conditionally convergent. If $\sum_{n=1}^\infty a_n$ converges, we say that the alternating series is absolutely convergent.


Problems 
1. Apply the integral test to the series 2K anh? and show that the series con- 
k=2 
verges or diverges according asa > l ora <1 
1 1 1 
2. Show that the series 1 — 5 + rar +e oe + (yen? + ++ + converges. 


3. Show that a necessary and sufficient condition for the convergence of $\sum_{n=1}^\infty a_n$ is that for any $\epsilon > 0$ an integer $N$ exists such that $|a_{n+1} + a_{n+2} + \cdots + a_m| < \epsilon$ for $m \ge n \ge N$.

4. Show that, for $|r| < 1$, $\lim_{n \to \infty} r^n = 0$. Hint: $|r|^{n+1} = |r|\,|r|^n < |r|^n$.


3 


5. Show that lim Ne = 0 so that y = converges. 
Ly n! 


n> © 
= n=1 


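6. Show that the series

$$1 + \frac{\alpha\beta}{1 \cdot \gamma} + \frac{\alpha(\alpha + 1)\beta(\beta + 1)}{\gamma(\gamma + 1)\,2!} + \frac{\alpha(\alpha + 1)(\alpha + 2)\beta(\beta + 1)(\beta + 2)}{\gamma(\gamma + 1)(\gamma + 2)\,3!} + \cdots$$

converges if $\gamma > \alpha + \beta$ and diverges otherwise.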


o 


1 ; 
7. Show that the series > —;>- diverges. 


n n 
n=l] 


c “(2n — 1)!!4n +37? 
8. Consider the convergence of . eave on Lt 5 | ith 
1 = 


(2n)!! = 2-4-6-+-+- (2n) 
(Qn —- 1)! =1-3-5- +--+ (Qn — 1). 


9. Assume $a_n \ge a_{n+1} > 0$ for $n = 1, 2, 3, \ldots$, and further assume that $\sum_{n=1}^\infty a_n$ converges. Show that $\lim_{n \to \infty} n a_n = 0$. Why does $\sum_{n=1}^\infty \frac{1}{n}$ diverge?

10. Let $\sum_{n=1}^\infty s_n$ converge to $S$, $\sum_{n=1}^\infty t_n$ converge to $T$. Define $\sigma_n = \left( \sum_{j=1}^n s_j \right) \left( \sum_{j=1}^n t_j \right)$, and show that $\lim_{n \to \infty} \sigma_n = ST$.

11. Show that the series $\sum_{n=1}^\infty \frac{(-1)^{n+1}}{n}$ is conditionally convergent. Is it possible to rearrange the terms of this sequence so that the resulting sequence would converge to $\pi$ or any other number?

12. Consider the sequence $k_1, k_2, \ldots, k_n, \ldots$, defined by $k_{i+1} = 2\sqrt{k_i}/(1 + k_i)$, $0 < k_1 < 1$, $i = 1, 2, \ldots$. Show that $k_{i+1} > k_i$ and $0 < k_i < 1$ for $i = 1, 2, \ldots$. Then prove that $\lim_{i \to \infty} k_i = 1$.

13. Consider the sequence of complex numbers, $a_n + b_n i$, $n = 1, 2, 3, \ldots$. If the sequence $\{a_n\}$ converges to $a$ and the sequence $\{b_n\}$ converges to $b$, show that the sequence $\{a_n + b_n i\}$ converges to $a + bi$ in the sense that for any $\epsilon > 0$ an integer $N(\epsilon)$ exists such that

$$|(a + bi) - (a_n + b_n i)| < \epsilon \qquad \text{for } n \ge N$$

14. Show that the convergence of a sequence can be made to depend on the convergence of a series. Hint:

$$s_n = s_1 + (s_2 - s_1) + (s_3 - s_2) + \cdots + (s_n - s_{n-1})$$


10.21. Sequences and Series with Variable Terms. Let us consider a sequence of functions

$$f_1(x), f_2(x), \ldots, f_n(x), \ldots \tag{10.96}$$

with $f_n(x)$, $n = 1, 2, \ldots$, defined on the range $a \le x \le b$. For any number $x = c$ of the interval $(a, b)$ we can investigate the convergence of the sequence of constant terms

$$f_1(c), f_2(c), \ldots, f_n(c), \ldots \tag{10.97}$$

If the sequence of (10.97) converges, we can write

$$\lim_{n \to \infty} f_n(c) = A_c \tag{10.98}$$

where $A_c$ is a constant which obviously depends on the number $x = c$. If the sequence of (10.96) converges for all $x$ on the interval $(a, b)$, we obtain a set of numbers $\{A_x\}$ which defines a function of $x$ on $(a, b)$, for to each $x$ there corresponds a unique $A_x$. We write

$$\lim_{n \to \infty} f_n(x) = f(x) \qquad a \le x \le b \tag{10.99}$$

with $f(x) = A_x$.


Example 10.30. We consider the sequence $\{f_n(x)\}$ with $f_n(x) = x/(1 + nx)$, $n = 1, 2, \ldots$, $0 \le x \le 1$. At $x = 0$ we have $f_n(0) = 0$, $n = 1, 2, 3, \ldots$, so that $\lim_{n \to \infty} f_n(0) = 0$. For $x \ne 0$ it is obvious that $\lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \frac{x}{1 + nx} = 0$. The limiting function for this example is $f(x) = 0$, $0 \le x \le 1$. If we consider the sequence $\{f_n(x)\}$ with $f_n(x) = 1/(1 + nx)$, $n = 1, 2, 3, \ldots$, $0 \le x \le 1$, we note that

$$\lim_{n \to \infty} f_n(0) = \lim_{n \to \infty} 1 = 1$$

whereas, for $x \ne 0$, $\lim_{n \to \infty} f_n(x) = 0$. The limiting function in this latter case is defined by $f(x) = 0$ for $0 < x \le 1$, $f(0) = 1$. This function is discontinuous at $x = 0$. Let the reader graph a few terms of this sequence.

Example 10.31. Let us see how rapidly the sequence $\{x/(1 + nx)\}$ converges to $f(x) = 0$, $0 \le x \le 1$. At $x = 0$ the convergence occurs immediately since $f_n(0) = 0$ for $n = 1, 2, 3, \ldots$. For $x \ne 0$ we have

$$|f_n(x) - f(x)| = \frac{x}{1 + nx} < \epsilon \tag{10.100}$$

$\epsilon > 0$ and arbitrary, provided $(1 + nx)/x > 1/\epsilon$ or $n > 1/\epsilon - 1/x$. Thus for $n > 1/\epsilon$ (10.100) will hold since $1/\epsilon > 1/\epsilon - 1/x$. If $N$ is the first integer greater than $1/\epsilon$, we shall have $|f_n(x) - f(x)| < \epsilon$ for $n \ge N$. It is important to note that an integer $N$ can be found which is independent of $x$ for $0 \le x \le 1$. We say that the sequence converges uniformly to its limiting value $f(x) = 0$.


DEFINITION 10.27. The sequence $\{f_n(x)\}$ defined for $a \le x \le b$ is said to converge uniformly to $f(x)$ if for any $\epsilon > 0$ an integer $N$ exists such that

$$|f_n(x) - f(x)| < \epsilon \tag{10.101}$$

holds for all $x$ on $(a, b)$ provided $n \ge N$.

Uniform convergence is essentially the following: If one imagines that a circular tube of arbitrary radius $\epsilon > 0$ surrounds $f(x)$ throughout the length of the definition of $f(x)$, then uniform convergence guarantees that an integer $N$ can be found such that, for $n \ge N$, $f_n(x)$ will also lie inside the tube for all $x$ on $(a, b)$. In other words, all terms of the sequence from the $N$th term onward lie inside the $\epsilon$ tube throughout the length of the tube. The value of $N$, in general, depends on $\epsilon$. Since the curves $f_n(x)$, $n = 1, 2, 3, \ldots$, are two-dimensional, the tube need only be two-dimensional. It is not necessary that $f(x)$, and hence the tube, be continuous.


Example 10.32. Let us consider the sequence $\{1 - x^n\}$, $n = 1, 2, 3, \ldots$, defined for $-\frac{2}{3} \le x \le \frac{2}{3}$. From Prob. 4, Sec. 10.20, we have that $\lim_{n \to \infty} x^n = 0$ if $|x| \le \frac{2}{3}$. Thus $f(x) = \lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} (1 - x^n) = 1$ for $-\frac{2}{3} \le x \le \frac{2}{3}$. We wish to determine whether the convergence is uniform. We have $|f_n(x) - f(x)| = |x|^n \le (\frac{2}{3})^n$, since $|x| \le \frac{2}{3}$ is our range of definition. Hence we can make $|f_n(x) - f(x)| < \epsilon$ if we choose $n$ sufficiently large so that $(\frac{2}{3})^n < \epsilon$. This can be done if we choose $n > \ln \epsilon/(\ln 2 - \ln 3)$ for $0 < \epsilon < 1$. If $\epsilon \ge 1$, we can choose $n \ge 1$. Thus the sequence $\{f_n(x)\}$ converges uniformly to $f(x)$ on the range $-\frac{2}{3} \le x \le \frac{2}{3}$.
Example 10.33. We consider the sequence $\{nxe^{-nx}\}$ defined for $0 \le x \le 1$. Let the reader show that

$$f(x) = \lim_{n \to \infty} nxe^{-nx} = 0 \qquad 0 \le x \le 1$$

We show, however, that the sequence does not converge uniformly to $f(x)$. First we note that

$$|f_n(x) - f(x)| = nxe^{-nx} \qquad 0 \le x \le 1$$

If the convergence were uniform, then for $\epsilon = 0.01$ we would be able to find an integer $N$ such that $nxe^{-nx} < 0.01$ for $n \ge N$ and for all $x$ on $0 \le x \le 1$. In particular $Nxe^{-Nx}$ would be less than 0.01 for all $x$ on $(0, 1)$. If we choose $x = 1/N$, we have $N \cdot (1/N)e^{-N \cdot (1/N)} = e^{-1} > 0.01$, a contradiction. Hence the sequence fails to converge uniformly to $f(x)$ on the range $0 \le x \le 1$.
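The contrast between Examples 10.31 and 10.33 shows up clearly if one tabulates the largest deviation $|f_n(x) - f(x)|$ over the interval. The Python sketch below is only an illustration: it takes the maximum over a finite grid chosen to contain $x = 1/n$, and the names `sup_on_grid`, `g`, and `h` are ours.

```python
import math

def sup_on_grid(f, n: int, points_per_unit: int = 1000) -> float:
    """Largest value of f(x, n) over the grid x = j/m, j = 0..m, with m = points_per_unit * n."""
    m = points_per_unit * n          # x = 1/n is the grid point j = points_per_unit
    return max(f(j / m, n) for j in range(m + 1))

g = lambda x, n: n * x * math.exp(-n * x)   # Example 10.33: limit function is 0
h = lambda x, n: x / (1 + n * x)            # Example 10.31: limit function is 0

for n in (5, 10, 20, 40):
    print(n, sup_on_grid(g, n), sup_on_grid(h, n))
```

The first column stays at $e^{-1}$ for every $n$, so no single $N$ works for all $x$; the second column is $1/(1 + n)$ and tends to zero.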


The following theorems will emphasize the importance of uniform 
convergence: 

THEOREM 10.29. Let $\{f_n(x)\}$ be a sequence of continuous functions defined for $a \le x \le b$. If the sequence converges uniformly to $f(x)$ on $(a, b)$, then $f(x)$ is continuous on $(a, b)$.

The proof is simple. Let $x = c$ be any point of $(a, b)$, and choose any $\epsilon > 0$. From

$$f(c + h) - f(c) = [f(c + h) - f_n(c + h)] + [f_n(c + h) - f_n(c)] + [f_n(c) - f(c)]$$

we have

$$|f(c + h) - f(c)| \le |f(c + h) - f_n(c + h)| + |f_n(c + h) - f_n(c)| + |f_n(c) - f(c)| \tag{10.102}$$

From uniform convergence we note that an integer $N$ exists such that $|f_n(x) - f(x)| < \epsilon/3$ for $n \ge N$ and for all $x$ on $(a, b)$. Applying this result to (10.102) yields

$$|f(c + h) - f(c)| < \tfrac{2}{3}\epsilon + |f_N(c + h) - f_N(c)|$$

Since $f_N(x)$ is continuous at $x = c$, we have $|f_N(c + h) - f_N(c)| < \epsilon/3$ for $|h| < \delta$, $\delta > 0$. Hence for any $\epsilon > 0$ a $\delta > 0$ exists such that $|f(c + h) - f(c)| < \epsilon$ for $|h| < \delta$. Q.E.D.

Example 10.33 shows that $f(x)$ can be continuous even though the convergence is not uniform. Uniform convergence is a sufficient condition for continuity to occur, but is not necessary.


THEOREM 10.30. Let $\{f_n(x)\}$ be a sequence of continuous functions defined for $a \le x \le b$. If the sequence converges uniformly to $f(x)$, then

$$\int_a^b f(x)\,dx = \int_a^b \lim_{n \to \infty} f_n(x)\,dx = \lim_{n \to \infty} \int_a^b f_n(x)\,dx \tag{10.103}$$

The proof of (10.103) proceeds as follows: Define $R_n(x)$ by

$$f(x) = f_n(x) + R_n(x) \tag{10.104}$$

Since $f(x)$ is continuous from the previous theorem, $R_n(x)$ is also continuous, for it is the difference of two continuous functions. From (10.104) we have

$$\int_a^b f(x)\,dx = \int_a^b f_n(x)\,dx + \int_a^b R_n(x)\,dx$$

or

$$\left| \int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx \right| = \left| \int_a^b R_n(x)\,dx \right|$$

From uniform convergence we note that for any $\epsilon > 0$, and hence for $\epsilon/(b - a) > 0$, an integer $N$ exists such that $|R_n(x)| < \epsilon/(b - a)$ for $n \ge N$ and for all $x$ on $(a, b)$. Thus

$$\left| \int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx \right| \le \int_a^b |R_n(x)|\,dx < \epsilon$$

for $n \ge N$. This means that the sequence $\int_a^b f_n(x)\,dx$ converges to $\int_a^b f(x)\,dx$, so that (10.103) results. Q.E.D. It should be emphasized that Theorem 10.30 was shown to be true provided $a$ and $b$ are finite.


Example 10.34. We consider the sequence of Example 10.31. We have

$$\lim_{n\to\infty} \int_0^1 \frac{x}{1+nx}\,dx = \lim_{n\to\infty} \frac{1}{n} \int_0^1 \left(1 - \frac{1}{1+nx}\right) dx = \lim_{n\to\infty} \frac{1}{n}\left[1 - \frac{\ln(1+n)}{n}\right] = 0$$

Moreover $f(x) = 0$ so that $\int_0^1 f(x)\,dx = 0$, which checks the results of Theorem 10.30
since the convergence was seen to be uniform.
Example 10.35. Let the reader show that the sequence $\{nxe^{-nx^2}\}$ does not converge
uniformly on the range $0 \le x \le 1$ but does converge to $f(x) = 0$. Now

$$\int_0^1 nxe^{-nx^2}\,dx = -\tfrac{1}{2} e^{-nx^2} \Big|_0^1 = \tfrac{1}{2}(1 - e^{-n})$$

so that

$$\lim_{n\to\infty} \int_0^1 nxe^{-nx^2}\,dx = \tfrac{1}{2} \ne \int_0^1 f(x)\,dx = 0$$

On the other hand, the sequence $\{nx/(1 + n^2x^2)\}$ does not converge uniformly on the
range $0 \le x \le 1$ but does converge to $f(x) = 0$, and

$$\lim_{n\to\infty} \int_0^1 \frac{nx}{1+n^2x^2}\,dx = \lim_{n\to\infty} \frac{\ln(1+n^2)}{2n} = 0 = \int_0^1 f(x)\,dx$$






Uniform convergence is a sufficient condition to apply (10.103), but it is not a neces- 
sary condition. 
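
The two integrals of Example 10.35 can be checked numerically. The following Python sketch is only an illustration: the function names and the midpoint rule are choices made here, not part of the text. It evaluates the closed forms $\tfrac{1}{2}(1 - e^{-n})$ and $\ln(1+n^2)/(2n)$ and compares them with a direct quadrature.

```python
import math

def integral_nx_exp(n):
    """Exact value of the integral of n*x*exp(-n*x**2) over [0, 1]."""
    return 0.5 * (1.0 - math.exp(-n))

def integral_rational(n):
    """Exact value of the integral of n*x/(1 + n**2*x**2) over [0, 1]."""
    return math.log(1.0 + n * n) / (2.0 * n)

def midpoint_rule(f, a, b, steps=20000):
    """Composite midpoint rule, used only as an independent numerical check."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

if __name__ == "__main__":
    # First column tends to 1/2 (limit and integral cannot be interchanged);
    # second column tends to 0 (they can, although convergence is not uniform).
    for n in (1, 10, 100, 1000):
        print(n, integral_nx_exp(n), integral_rational(n))
```

The first sequence of integrals tends to 1/2 while the second tends to 0, in agreement with the example.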


THEOREM 10.31. Let $\{f_n(x)\}$ be a sequence defined over $a \le x \le b$
which is known to converge to the constant $f(c)$ at x = c, $a \le c \le b$.
Assume further that the sequence $\{f_n'(x)\}$ converges uniformly to $g(x)$
on (a, b) with $f_n'(x)$ continuous on (a, b) for n = 1, 2, .... We show
that the sequence $\{f_n(x)\}$ converges uniformly to a function $f(x)$ such
that $f'(x) = g(x)$ for $a \le x \le b$, that is,

$$\lim_{n\to\infty} \frac{d}{dx} f_n(x) = f'(x) = \frac{d}{dx} \lim_{n\to\infty} f_n(x) \tag{10.105}$$

Since the sequence $\{f_n'(x)\}$ converges uniformly to $g(x)$ we have from
Theorem 10.30 that

$$\int_c^x g(x)\,dx = \lim_{n\to\infty} \int_c^x f_n'(x)\,dx = \lim_{n\to\infty} [f_n(x) - f_n(c)] \tag{10.106}$$

From (10.106) the reader can readily deduce that

$$\lim_{n\to\infty} f_n(x) = \int_c^x g(x)\,dx + f(c)$$

Thus $\lim_{n\to\infty} f_n(x) = f(x)$ exists, and since $g(x)$ is continuous, we have
$f'(x) = g(x)$. From $f'(x) = f_n'(x) + R_n(x)$ with $|R_n(x)| < \epsilon$ for $n \ge N$
let the reader show that the sequence $\{f_n(x)\}$ converges to $f(x)$ uniformly.
If $f_n(x)$, n = 1, 2, 3, ..., is defined for $a \le x \le b$, we say that the
series $\sum_{n=1}^{\infty} f_n(x)$ converges at x = c if the series of constant terms
$\sum_{n=1}^{\infty} f_n(c)$ converges, $a \le c \le b$. We write $f(c) = \sum_{n=1}^{\infty} f_n(c)$, where

$$f(c) = \lim_{n\to\infty} \sum_{k=1}^{n} f_k(c)$$

If $\sum_{n=1}^{\infty} f_n(x)$ converges for all x on (a, b), we write

$$f(x) = \sum_{n=1}^{\infty} f_n(x) = \lim_{n\to\infty} \sum_{k=1}^{n} f_k(x) \tag{10.107}$$

It is convenient to express $f(x)$ as a finite series plus a remainder which
obviously contains an infinite number of terms,

$$f(x) = \sum_{k=1}^{n} f_k(x) + R_n(x) \tag{10.108}$$

with $R_n(x) = \sum_{k=n+1}^{\infty} f_k(x)$.


DEFINITION 10.28. The series $\sum_{n=1}^{\infty} f_n(x)$ is said to converge uniformly to
$f(x)$ on (a, b) if for any $\epsilon > 0$ an integer N exists such that

$$|R_n(x)| < \epsilon \qquad \text{for } n \ge N,\ a \le x \le b$$

In terms of the Cauchy criterion we note that uniform convergence exists
if for any $\epsilon > 0$ an integer N exists such that

$$|s_{n+p}(x) - s_n(x)| < \epsilon \tag{10.109}$$

for $n \ge N$, all $p > 0$, and for all x on (a, b), with $s_n(x) = \sum_{k=1}^{n} f_k(x)$.


The results concerning continuity, integration, and differentiation of 
uniformly convergent sequences apply equally well to any uniformly con- 
vergent series. To prove these results, one need only change the series 
into a sequence [see (10.88)]. 

Before constructing a few tests for determining the uniform convergence
of a series we prove a result due to Abel. Let $u_1, u_2, \ldots, u_n$
be a monotonic nonincreasing sequence of positive terms, $u_{j+1} \le u_j$,
$u_j > 0$. Let $a_1, a_2, \ldots, a_n$ be any set of numbers, with M and m the
maximum and minimum, respectively, of the set

$$a_1,\ a_1 + a_2,\ a_1 + a_2 + a_3,\ \ldots,\ \sum_{j=1}^{n} a_j$$

We show that

$$m u_1 \le \sum_{j=1}^{n} a_j u_j \le M u_1 \tag{10.110}$$

Proof. Let $s_1 = a_1$, $s_2 = a_1 + a_2$, ..., $s_n = \sum_{j=1}^{n} a_j$, so that $a_1 = s_1$,
$a_2 = s_2 - s_1$, $a_3 = s_3 - s_2$, ..., $a_n = s_n - s_{n-1}$. We write

$$\sum_{j=1}^{n} a_j u_j = s_1 u_1 + (s_2 - s_1)u_2 + (s_3 - s_2)u_3 + \cdots + (s_n - s_{n-1})u_n$$
$$= s_1(u_1 - u_2) + s_2(u_2 - u_3) + \cdots + s_{n-1}(u_{n-1} - u_n) + s_n u_n$$

Since $u_{j+1} \le u_j$ for j = 1, 2, ..., n - 1, and since $u_n > 0$, we note that

$$\sum_{j=1}^{n} a_j u_j \le M(u_1 - u_2) + M(u_2 - u_3) + \cdots + M u_n = M u_1$$

$$\sum_{j=1}^{n} a_j u_j \ge m(u_1 - u_2) + m(u_2 - u_3) + \cdots + m u_n = m u_1$$

which proves (10.110).
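
Abel's inequality (10.110) is easy to test on a concrete set of numbers. The sketch below is an illustration only; the function name `abel_bounds` and the sample data are assumptions introduced here. It forms the partial sums $s_j$, takes their minimum m and maximum M, and returns the three quantities $m u_1$, $\sum a_j u_j$, $M u_1$.

```python
from itertools import accumulate

def abel_bounds(a, u):
    """Return (m*u1, sum(a_j*u_j), M*u1) for Abel's inequality (10.110).

    `u` is assumed positive and nonincreasing; `a` is arbitrary.
    """
    if any(x <= 0 for x in u) or any(u[j + 1] > u[j] for j in range(len(u) - 1)):
        raise ValueError("u must be positive and monotonic nonincreasing")
    partial_sums = list(accumulate(a))   # s_1, s_2, ..., s_n
    m, M = min(partial_sums), max(partial_sums)
    total = sum(aj * uj for aj, uj in zip(a, u))
    return m * u[0], total, M * u[0]

if __name__ == "__main__":
    lower, total, upper = abel_bounds([1, -2, 3, -1, 0.5], [5, 4, 4, 2, 1])
    print(lower, total, upper)
```

For the sample data the partial sums are 1, -1, 2, 1, 1.5, so m = -1 and M = 2, and the weighted sum lies between $m u_1$ and $M u_1$.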


A. Abel's Test for Uniform Convergence. $\sum_{n=1}^{\infty} a_n u_n(x)$ converges
uniformly on (a, b) if

1. $\sum_{n=1}^{\infty} a_n$ converges.
2. $u_n(x) > 0$ and $u_n(x) \ge u_{n+1}(x)$, $a \le x \le b$, $n \ge 1$.
3. $|u_1(x)| < k$ for all x on (a, b).

Proof. Since $\sum_{n=1}^{\infty} a_n$ converges, an integer N exists such that
$\left| \sum_{n=N+1}^{N+p} a_n \right| < \epsilon/k$ for all $p > 0$. From (10.110), Abel's lemma, we have

$$\left| \sum_{n=N+1}^{N+p} a_n u_n(x) \right| < \frac{\epsilon}{k} u_{N+1}(x) \le \frac{\epsilon}{k} u_1(x) < \epsilon$$

for all $p > 0$. Q.E.D.
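B. The Weierstrass M Test. Assume $|u_n(x)| \le M_n$ = constant, for
n = 1, 2, 3, ..., $a \le x \le b$. If $\sum_{n=1}^{\infty} M_n$ converges, then $\sum_{n=1}^{\infty} u_n(x)$
converges uniformly on (a, b). Proof. For any $\epsilon > 0$ an integer N
exists such that

$$\sum_{n=N+1}^{N+p} M_n < \epsilon \qquad \text{for all } p > 0$$

But

$$\left| \sum_{n=N+1}^{N+p} u_n(x) \right| \le \sum_{n=N+1}^{N+p} |u_n(x)| \le \sum_{n=N+1}^{N+p} M_n < \epsilon \qquad \text{Q.E.D.}$$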


C. $\sum_{n=1}^{\infty} u_n(x)v_n(x)$ is uniformly convergent on (a, b) if:

1. $\left| \sum_{k=1}^{n} u_k(x) \right| < M$ = constant, for all n and for all x on (a, b).
2. $\sum_{n=1}^{\infty} |v_{n+1}(x) - v_n(x)|$ is uniformly convergent on (a, b).
3. $v_n(x) \to 0$ uniformly as $n \to \infty$.

Proof. Let $s_n(x) = \sum_{k=1}^{n} u_k(x)$, so that $|s_n(x)| < M$ from (1), and let

$$S_n(x) = \sum_{k=1}^{n} u_k(x)v_k(x)$$

Now

$$S_{n+p} - S_n = u_{n+1}v_{n+1} + u_{n+2}v_{n+2} + \cdots + u_{n+p}v_{n+p}$$
$$= (s_{n+1} - s_n)v_{n+1} + \cdots + (s_{n+p} - s_{n+p-1})v_{n+p}$$
$$= -s_n v_{n+1} + s_{n+1}(v_{n+1} - v_{n+2}) + \cdots + s_{n+p}v_{n+p}$$

$$|S_{n+p} - S_n| \le |s_n|\,|v_{n+1}| + \sum_{r=n+1}^{n+p-1} |s_r|\,|v_r - v_{r+1}| + |s_{n+p}|\,|v_{n+p}|$$
$$< M\left[ |v_{n+1}| + |v_{n+p}| + \sum_{r=n+1}^{n+p-1} |v_r - v_{r+1}| \right]$$

Since $v_n(x) \to 0$ uniformly as $n \to \infty$, we have $|v_{n+1}| < \epsilon/3M$, $|v_{n+p}| < \epsilon/3M$
for $n \ge N_1$. From (2), $\sum_{r=n+1}^{n+p-1} |v_r - v_{r+1}| < \epsilon/3M$ for $n \ge N_2$. For
N equal to the larger of $N_1$, $N_2$ we have

$$|S_{n+p} - S_n| < \epsilon \qquad n \ge N,\ \text{all } p > 0 \qquad \text{Q.E.D.}$$


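Example 10.36. We consider

$$g(x) = 1^{-x} - 2^{-x} + 3^{-x} - 4^{-x} + \cdots + (-1)^{n+1}n^{-x} + \cdots$$

for $0 < \delta \le x \le R$. We have $g(x) = \sum_{n=1}^{\infty} a_n v_n(x)$ with $v_n(x) = n^{-x}$ and

(1) $a_n = (-1)^{n+1}$, $\left| \sum_{k=1}^{n} a_k \right| < 2$

(2) $|v_{n+1} - v_n| = |(n+1)^{-x} - n^{-x}| = \left| x \int_n^{n+1} t^{-(x+1)}\,dt \right|$, $x > 0$

$$= x \int_n^{n+1} t^{-(x+1)}\,dt < x n^{-(x+1)}$$

$$\sum_{n=1}^{\infty} |v_{n+1} - v_n| < x \sum_{n=1}^{\infty} n^{-(x+1)} \le R \sum_{n=1}^{\infty} n^{-(\delta+1)}$$

Since $\sum_{n=1}^{\infty} n^{-(\delta+1)}$ converges, the Weierstrass M test states that $\sum_{n=1}^{\infty} |v_{n+1} - v_n|$
converges uniformly for $0 < \delta \le x \le R$.

(3) $n^{-x} \le n^{-\delta} \to 0$ so that $n^{-x} \to 0$ uniformly.

From (C), $g(x)$ converges uniformly for $0 < \delta \le x \le R$.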


An important type of series is the power series

$$P(x) = \sum_{n=0}^{\infty} a_n x^n \tag{10.111}$$

If we apply the ratio test, we note that the series of (10.111) converges if

$$\lim_{n\to\infty} \left| \frac{a_{n+1}x^{n+1}}{a_n x^n} \right| < 1 \qquad \text{or} \qquad |x| < \lim_{n\to\infty} \left| \frac{a_n}{a_{n+1}} \right| = R \tag{10.112}$$

We call R the radius of convergence of the series. The word "radius"
comes into play if we consider the complex series $P(z) = \sum_{n=0}^{\infty} a_n z^n$,
$z = x + iy$, which converges for $|z| < R$. If $P(x)$ converges for $|x| < R$,
it follows that the series $Q(x) = \sum_{n=0}^{\infty} |a_n| x^n$ also converges for $|x| < R$
[see (10.112)]. The ratio test fails to tell us anything concerning the end
points x = -R, x = R. The convergence of $\sum_{n=0}^{\infty} a_n R^n$ and $\sum_{n=0}^{\infty} a_n(-R)^n$
must be examined by the methods of Sec. 10.20 or by other means.


Example 10.37. (1) $\sum_{n=0}^{\infty} n!\,x^n$ converges only for x = 0. (2) $\sum_{n=0}^{\infty} x^n$ converges
for $-1 < x < 1$. (3) $\sum_{n=0}^{\infty} x^n/n!$ converges for all finite x.


If $P(x) = \sum_{n=0}^{\infty} a_n x^n$ converges for $|x| < R$, it is very easy to show that
$\sum_{n=0}^{\infty} a_n x^n$ converges uniformly for $|x| \le R_1 < R$. We have for $|x| \le R_1$
that $|a_n x^n| \le |a_n| R_1^n$, n = 1, 2, 3, ..., and since $\sum_{n=0}^{\infty} |a_n| R_1^n$ converges,
the Weierstrass M test yields uniform convergence. Since each term of
the series is continuous, $P(x)$ is continuous inside its region of convergence.
Let the reader show that $P'(x)$ has the same radius of convergence
as $P(x)$. Why is it that $P'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$?

A result due to Abel concerning power series can be stated as follows:

If $\sum_{n=0}^{\infty} a_n$ converges to s, then $\sum_{n=0}^{\infty} a_n x^n$ is uniformly convergent for
$0 \le x \le 1$, and $\lim_{x\to1} \sum_{n=0}^{\infty} a_n x^n = s$. The proof depends on the result
given by (10.110). Since $\sum_{n=0}^{\infty} a_n$ converges, we know that for any $\epsilon > 0$
an integer N exists such that $|a_n + a_{n+1} + \cdots + a_{n+p}| < \epsilon$ for $n \ge N$,
all $p \ge 0$. For $0 \le x \le 1$ we know that $x^n$, n = 0, 1, 2, 3, ..., is a
monotonic nonincreasing sequence, so that

$$\left| \sum_{j=n}^{n+p} a_j x^j \right| \le \epsilon x^n \le \epsilon$$

[see (10.110)]. Hence the series is uniformly convergent for $0 \le x \le 1$
so that $P(x) = \sum_{n=0}^{\infty} a_n x^n$ is continuous on the interval $0 \le x \le 1$. Hence
$\lim_{x\to1} P(x) = P(1) = \sum_{n=0}^{\infty} a_n = s$. For example, it is known that

$$\ln(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}x^n}{n}$$

for $|x| < 1$. Since $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$ converges, we have

$$\ln 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$$
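
Abel's theorem can be illustrated on the logarithmic series. The Python sketch below is an illustration only (the function name and the number of terms are choices made here). It sums $\sum (-1)^{n+1}x^n/n$ for x inside the interval of convergence and at the end point x = 1, and compares the results with ln(1 + x).

```python
import math

def alternating_log_series(x, terms):
    """Partial sum of sum_{n=1}^{terms} (-1)**(n+1) * x**n / n."""
    total = 0.0
    power = 1.0
    for n in range(1, terms + 1):
        power *= x                      # power == x**n
        sign = 1.0 if n % 2 == 1 else -1.0
        total += sign * power / n
    return total

if __name__ == "__main__":
    for x in (0.5, 0.9, 0.99, 1.0):
        approx = alternating_log_series(x, 100000)
        print(x, approx, math.log(1.0 + x))
```

At x = 1 the series is alternating with decreasing terms, so the error after N terms is less than 1/(N + 1); convergence is slow but the limit agrees with ln 2.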





The converse of Abel's theorem is not true. If $P(x) = \sum_{n=0}^{\infty} a_n x^n$ converges
to s as $x \to 1$, we cannot say that $\sum_{n=0}^{\infty} a_n$ converges to s. An
example due to Tauber is as follows: $P(x) = \sum_{n=0}^{\infty} (-1)^n x^n = 1/(1 + x)$,
and $\lim_{x\to1} P(x) = \tfrac{1}{2}$. However, $\sum_{n=0}^{\infty} (-1)^n$ is not convergent.
Problems 


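1. Show that $1/(1 + t^2) = \sum_{n=0}^{\infty} (-1)^n t^{2n}$ converges uniformly for $0 \le t \le x < 1$.
Integrate, and show that

$$\tan^{-1} x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1} \qquad \frac{\pi}{4} = \sum_{n=0}^{\infty} (-1)^n \frac{1}{2n+1}$$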
2. If $\lim_{n\to\infty} \sqrt[n]{|a_n|} = r$ exists, show that $P(x) = \sum_{n=0}^{\infty} a_n(x - x_0)^n$ converges for
$|x - x_0| < 1/r$.
oo 
Jun (z)| 


uy (x) is said to be absolutely convergent fora S x S bif 
n= () 


8. The series > 

n= : 

converges fora Sz Sb. Show that S(z) = : al - is absolutely convergent 
for |z| S 1 and that S(z) is not uniformly eee for |z| 

Bi Mare 2 is uniformly convergent for 7/4 S$ x S 3nr/4 


4. Show that > on ti 
n=0Q 
By considering x = 2/2 show that the series is not absolutely convergent for +/4 


x & 3n/4. 
5. Show that 


1 Ing . Ee c 1 3? 
, dz = > i xr Inzgdzr= — > (n+1)) age 
n=0 n=0 


L-—2z 





@ 


. , 7 
6. Prove that the series D i+ nizt 


n=l 
2 
7. Prove that i io ee ae 
l-aza 4 
Un(%) converge uniformly for a $ x < 6b, and assume further 


is uniformly convergent for all z. 


8 Let Siz) = 


n=] 




that lim u,(7) = L, for all n, un(z) not necessarily continuous. If > L, con- 
w—c nw 


verges, show that lim S(z) = >, Le 


zc 


n=} 
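9. Let $a_{mn} \ge 0$ for m = 1, 2, 3, ..., n = 1, 2, 3, ..., and suppose a constant
K exists such that

$$\sum_{n=1}^{N} \sum_{m=1}^{M} a_{mn} < K$$

for all M, N. Show that $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn}$ exists and that

$$\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} \tag{10.113}$$

Write the elements $\{a_{mn}\}$ as a square array, and interpret (10.113).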


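10. If $P(x) = \sum_{n=0}^{\infty} a_n x^n$ and if

$$P(x + w) = \sum_{n=0}^{\infty} a_n(x + w)^n = \sum_{n=0}^{\infty} \sum_{r=0}^{n} a_n \binom{n}{r} w^r x^{n-r}$$

converges absolutely, that is, $\sum_{n=0}^{\infty} \sum_{r=0}^{n} |a_n| \binom{n}{r} |w|^r |x|^{n-r}$ converges, show that

$$P(x + w) = \sum_{r=0}^{\infty} \sum_{s=0}^{\infty} \binom{r+s}{r} a_{r+s}\, x^s w^r \tag{10.114}$$

Apply (10.114) to $E(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}$, and show that $E(x + w) = E(x)E(w)$.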

11. A series $\sum_{n=1}^{\infty} u_n(x)$ is said to be "boundedly convergent" for the interval
$a \le x \le b$ if $\sum_{n=1}^{\infty} u_n(x)$ converges for all x on (a, b) and if a constant M exists such that
$\left| \sum_{n=1}^{N} u_n(x) \right| < M$ for all N and for all x on (a, b). The series is said to be uniformly
convergent on (a, b) except for the point c if the series converges uniformly on the intervals
$a \le x \le c - \delta$, $c + \delta \le x \le b$, however small $\delta$ may be, $\delta > 0$. Show that if a series is
uniformly convergent on $a \le x \le b$ except for a finite number of points and if the
series is also boundedly convergent then the series may be integrated term by term.
12. For what ranges of x do the following series converge uniformly? 


xz? +n? x*n? 
1 1 


n= n= 
18. Second Law of the Mean for Integrals (Bonnet). Consider 


n—-1 


b 
[P te@e@ ae = Y fenyoled enn — 2) 


j=0 


Let ¢(x) be positive and monotonic nonincreasing on (a, b), and consider the set of 
numbers 


So = f(a)(a — a) 
8S; = f(a) (x1 — a) + f(x1) (a2 — 41) 


n—1l1 
S.= ) fe den — 2) f ” $(a) dx 
7=0 


Let A and B be the minimum and maximum values of the set S,,72 = 0,1,2, ...,n, 
apply Abel’s result of (10.110), and show that 


Aeta) s [° fe)olz) de 5 Bela) 
If f(z) is continuous, show that 
b E 
[Ptee0@ az = o@) [*faar asses 
Consider ¢(z) — ¢(b) ¥ 0, ¢(x) monotonic nondecreasing and positive, and show that 
b g b 
[P1@e@) dz = 00) [7@) dz + 0) ["F@ ae (10.115) 


10.22. Dini's Conditions. A simple example shows that (10.103) does
not necessarily hold if one of the limits of integration is infinite. Let us
consider the sequence $\{f_n(x)\}$ with $f_n(x) = (2x/n^2)e^{-x^2/n^2}$, n = 1, 2, 3,
..., with $x \ge 0$. It is easy to see that $\lim_{n\to\infty} f_n(x) = f(x) = 0$ for $x \ge 0$.
By setting $f_n'(x) = 0$ one easily shows that the maximum value of $f_n(x)$
for $x \ge 0$ occurs at $x = n/\sqrt{2}$, so that $|f_n(x)| \le \sqrt{2/e}\,(1/n) < 1/n < \epsilon$
for $n \ge N = 1 + [1/\epsilon]$. Hence the sequence converges uniformly to its
limit $f(x) = 0$ for $x \ge 0$. On the other hand,

$$\lim_{n\to\infty} \int_0^{\infty} f_n(x)\,dx = \lim_{n\to\infty} \int_0^{\infty} \frac{2x}{n^2} e^{-x^2/n^2}\,dx = \lim_{n\to\infty} 1 = 1$$

but

$$\int_0^{\infty} \lim_{n\to\infty} f_n(x)\,dx = \int_0^{\infty} 0 \cdot dx = 0$$
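
The behavior of this sequence can be tabulated directly. The following Python sketch is only an illustration (the function names are assumptions introduced here). It uses the closed forms: the maximum of $f_n$ is $\sqrt{2/e}/n$, and $\int_0^X f_n(x)\,dx = 1 - e^{-X^2/n^2}$.

```python
import math

def f(n, x):
    """f_n(x) = (2x/n^2) * exp(-x^2/n^2)."""
    return (2.0 * x / n**2) * math.exp(-(x / n) ** 2)

def sup_f(n):
    """Maximum of f_n on x >= 0, attained at x = n/sqrt(2)."""
    return f(n, n / math.sqrt(2.0))

def integral_to(n, X):
    """Exact integral of f_n over [0, X]."""
    return 1.0 - math.exp(-(X / n) ** 2)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # sup -> 0 (uniform convergence), yet the integral over [0, oo) stays 1.
        print(n, sup_f(n), integral_to(n, 1e6 * n))
```

The supremum tends to zero while the integral over the whole half line remains 1; for a fixed finite upper limit X the integral does tend to zero, as Theorem 10.30 requires.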




We now state and prove a result due to Dini. We can write

$$\int_a^{\infty} \sum_{n=1}^{\infty} u_n(x)\,dx = \sum_{n=1}^{\infty} \int_a^{\infty} u_n(x)\,dx \tag{10.116}$$

if the following conditions are fulfilled:

1. $u_n(x)$ is continuous for $x \ge a$, n = 1, 2, 3, ....
2. $\lim_{x\to\infty} v_n(x) = v_n(\infty)$ exists for n = 1, 2, 3, ..., with
   $v_n(x) = \int_a^x u_n(t)\,dt$.
3. $\sum_{n=1}^{\infty} u_n(x)$ converges uniformly for $a \le x \le X$, X arbitrary but finite.
4. $\sum_{n=1}^{\infty} v_n(x)$ converges uniformly for $x \ge a$.

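Proof. From (3) and (2) we can write

$$\int_a^X \sum_{n=1}^{\infty} u_n(x)\,dx = \sum_{n=1}^{\infty} \int_a^X u_n(x)\,dx = \sum_{n=1}^{\infty} v_n(X)$$

Hence

$$\int_a^{\infty} \sum_{n=1}^{\infty} u_n(x)\,dx = \lim_{X\to\infty} \int_a^X \sum_{n=1}^{\infty} u_n(x)\,dx = \lim_{X\to\infty} \sum_{n=1}^{\infty} v_n(X)$$

To prove (10.116), we must show that $\lim_{X\to\infty} \sum_{n=1}^{\infty} v_n(X) = \sum_{n=1}^{\infty} v_n(\infty)$, since
$\sum_{n=1}^{\infty} v_n(\infty) = \sum_{n=1}^{\infty} \int_a^{\infty} u_n(x)\,dx$. It is first necessary to show that $\sum_{n=1}^{\infty} v_n(\infty)$
exists by making use of (2) and (4). From (4) an integer N can be found
such that

$$\left| \sum_{n=1}^{\infty} v_n(X) - \sum_{n=1}^{N} v_n(X) \right| < \frac{\epsilon}{4} \qquad \text{for all } X \ge a$$

From (2), $\lim_{X\to\infty} v_n(X) = v_n(\infty)$ for n = 1, 2, 3, ..., N, so that
$\left| \sum_{n=1}^{N} [v_n(X) - v_n(\infty)] \right| < \epsilon/4$ for $X \ge X_0$, since the limit of a finite sum is the sum of
the limits.
The identity

$$\sum_{n=1}^{\infty} v_n(X) - \sum_{n=1}^{N} v_n(\infty) = \left[ \sum_{n=1}^{\infty} v_n(X) - \sum_{n=1}^{N} v_n(X) \right] + \sum_{n=1}^{N} [v_n(X) - v_n(\infty)]$$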


shows that for any $\epsilon > 0$ we can find an integer N and then an $X_0$ such
that

$$\left| \sum_{n=1}^{\infty} v_n(X) - \sum_{n=1}^{N} v_n(\infty) \right| < \frac{\epsilon}{2} \tag{10.117}$$

for $X \ge X_0$. By the same reasoning we have

$$\left| \sum_{n=1}^{\infty} v_n(X) - \sum_{n=1}^{N+p} v_n(\infty) \right| < \frac{\epsilon}{2}$$

for $X \ge X_1$ and p a positive fixed integer. Choosing any X larger than
both $X_0$ and $X_1$ yields

$$\left| \sum_{n=1}^{N+p} v_n(\infty) - \sum_{n=1}^{N} v_n(\infty) \right| < \epsilon \tag{10.118}$$

Now (10.118) holds independent of any X since no X appears in (10.118),
so that for any $\epsilon > 0$ an integer N exists such that (10.118) holds for
all $p > 0$. This is exactly the Cauchy criterion for convergence, so that
$\sum_{n=1}^{\infty} v_n(\infty)$ exists. Hence for $\epsilon/2 > 0$ an integer $N_1$ exists such that

$$\left| \sum_{n=1}^{\infty} v_n(\infty) - \sum_{n=1}^{N_1} v_n(\infty) \right| < \frac{\epsilon}{2} \tag{10.119}$$

Choosing the larger N of (10.117) and (10.119) and combining (10.117)
with (10.119) yields

$$\left| \sum_{n=1}^{\infty} v_n(X) - \sum_{n=1}^{\infty} v_n(\infty) \right| < \epsilon \tag{10.120}$$

for $X \ge X_0$. Formula (10.120) is just the statement that

$$\lim_{X\to\infty} \sum_{n=1}^{\infty} v_n(X) = \sum_{n=1}^{\infty} v_n(\infty) \qquad \text{Q.E.D.}$$
Example 10.38. We consider the Bessel function

$$J_0(x) = \sum_{n=0}^{\infty} \frac{(-1)^n (x/2)^{2n}}{n!\,n!}$$

and show that $\int_0^{\infty} e^{-x}J_0(x)\,dx = 1/\sqrt{2}$. The nth term of the series $e^{-x}J_0(x)$ is
$u_n(x) = e^{-x}\,\dfrac{(-1)^n x^{2n}}{2^{2n}\,n!\,n!}$, and we see that $u_n(x)$ is continuous for $x \ge 0$. Also

$$v_n(x) = \int_0^x u_n(t)\,dt = \frac{(-1)^n}{2^{2n}\,n!\,n!} \int_0^x e^{-t}t^{2n}\,dt$$

and

$$v_n(\infty) = \lim_{x\to\infty} v_n(x) = \frac{(-1)^n}{2^{2n}\,n!\,n!} \int_0^{\infty} e^{-t}t^{2n}\,dt = \frac{(-1)^n (2n)!}{2^{2n}\,n!\,n!}$$

Let the reader show that $\lim_{n\to\infty} v_n(\infty) = 0$. By the ratio test it is easy to determine
that $e^{-x}J_0(x)$ converges for $|x| \le X$, X arbitrary. Finally it remains to show that
$\sum_{n=0}^{\infty} v_n(x)$ converges uniformly for $x \ge 0$. If we write

$$\left| \sum_{n=0}^{\infty} v_n(x) \right| \le \sum_{n=0}^{\infty} \frac{(2n)!}{2^{2n}\,n!\,n!} \tag{10.121}$$

we cannot show uniform convergence since the series of constant terms in (10.121)
diverges. However, through integration by parts the reader can readily verify that

$$\frac{1}{2^{2n+2}(n+1)!\,(n+1)!} \int_0^x e^{-t}t^{2n+2}\,dt < \frac{1}{2^{2n}\,n!\,n!} \int_0^x e^{-t}t^{2n}\,dt \qquad x > 0$$

so that $\sum_{n=0}^{\infty} v_n(x)$ is a series of alternating terms with $|v_{n+1}(x)| < |v_n(x)|$ for $x > 0$,
$v_n(0) = 0$, n = 0, 1, 2, .... For such a series the remainder after n terms of the
series $\sum_{n=0}^{\infty} v_n(x)$ has the property that

$$|R_n(x)| \le |v_{n+1}(x)| \le \frac{(2n+2)!}{2^{2n+2}(n+1)!\,(n+1)!} \tag{10.122}$$

Since the right-hand side of (10.122) tends to zero as n becomes infinite, we note that
$\sum_{n=0}^{\infty} v_n(x)$ converges uniformly for $x \ge 0$. Applying (10.116) yields

$$\int_0^{\infty} e^{-x}J_0(x)\,dx = \sum_{n=0}^{\infty} \frac{(-1)^n (2n)!}{2^{2n}\,n!\,n!}$$

In the next section (Example 10.40) we shall show that

$$(1 + x)^{-1/2} = \sum_{n=0}^{\infty} \frac{(-1)^n (2n)!}{2^{2n}\,n!\,n!}\,x^n \tag{10.123}$$

For x = 1 we have $\int_0^{\infty} e^{-x}J_0(x)\,dx = 1/\sqrt{2}$.
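
The value $1/\sqrt{2}$ can be checked against partial sums of the alternating series $\sum (-1)^n (2n)!/(2^{2n}\,n!\,n!)$. The Python sketch below is an illustration only; the helper names are assumptions made here. It builds the coefficients $c_n = (2n)!/(2^{2n}\,n!\,n!)$ from the recurrence $c_{n+1} = c_n(2n+1)/(2n+2)$, which avoids large factorials.

```python
import math

def central_coefficients(count):
    """Yield c_n = (2n)! / (2**(2n) * n! * n!) for n = 0, ..., count - 1."""
    c = 1.0
    for n in range(count):
        yield c
        c *= (2 * n + 1) / (2 * n + 2)

def partial_sums(count):
    """Partial sums S_0, ..., S_{count-1} of sum (-1)**n * c_n."""
    sums, total = [], 0.0
    for n, c in enumerate(central_coefficients(count)):
        total += c if n % 2 == 0 else -c
        sums.append(total)
    return sums

if __name__ == "__main__":
    target = 1.0 / math.sqrt(2.0)
    sums = partial_sums(2001)
    print(sums[-2], target, sums[-1])  # consecutive partial sums bracket the target
```

Because the terms alternate and decrease to zero, consecutive partial sums bracket the limit; convergence is slow, since $c_n$ decays only like $1/\sqrt{\pi n}$.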




Problems 


oo 


1. Let S(z) = > een z 20. Show that I S(z) dz = 5 


oe 
1 
Ss 
n= 


ns ] 


2. We wish to determine $f(x)$ such that

$$\int_0^{\infty} e^{-sx}f(x)\,dx = \frac{1}{\sqrt{1 + s^2}} \qquad s \ge 0$$

Assume $f(x) = \sum_{n=0}^{\infty} a_n x^n$, integrate formally, make use of (10.123), and show that
$f(x) = J_0(x)$. Then justify your work.

3. Consider $f_n(x) = n^2xe^{-nx}$, n = 1, 2, 3, ..., show that
$\lim_{n\to\infty} \int_0^{\infty} f_n(x)\,dx \ne \int_0^{\infty} \lim_{n\to\infty} f_n(x)\,dx$, and determine which one of Dini's conditions is not fulfilled.


4. Consider f,(z) = nz/(1 + n?z?), n = 1, 2, 3, . . . , and show that 
lim [ne dx = I lim fn(x) dz 
n— © 0 0 n> 0 


Do Dini’s conditions hold for this case? 


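10.23. Taylor Series. Let $f(x)$ be a function defined on $c \le x \le d$
such that $f'(x)$, $f''(x)$, ..., $f^{(n)}(x)$ exist on (c, d) with $f^{(n)}(x)$ R-integrable.
We note that, for $c \le a < x \le d$ or $c \le x < a \le d$,

$$\int_a^x f^{(n)}(t)\,dt = f^{(n-1)}(x) - f^{(n-1)}(a)$$

$$\int_a^x dx \int_a^x f^{(n)}(t)\,dt = f^{(n-2)}(x) - f^{(n-2)}(a) - f^{(n-1)}(a)(x - a)$$

$$\int_a^x dx \int_a^x dx \int_a^x f^{(n)}(t)\,dt = f^{(n-3)}(x) - f^{(n-3)}(a) - f^{(n-2)}(a)(x - a) - f^{(n-1)}(a)\frac{(x-a)^2}{2!}$$

Continuing this integration process yields

$$\int_a^x dx \int_a^x dx \cdots \int_a^x f^{(n)}(t)\,dt = f(x) - f(a) - f'(a)(x - a) - f''(a)\frac{(x-a)^2}{2!} - \cdots - f^{(n-1)}(a)\frac{(x-a)^{n-1}}{(n-1)!}$$

so that

$$f(x) = \sum_{r=0}^{n-1} f^{(r)}(a)\frac{(x-a)^r}{r!} + R_n(x) \qquad R_n(x) = \int_a^x dx \int_a^x dx \cdots \int_a^x f^{(n)}(t)\,dt \tag{10.124}$$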



If M is any bound of $|f^{(n)}(x)|$ on the interval (a, x), we have

$$|R_n(x)| \le \left| \int_a^x dx \right| \left| \int_a^x dx \right| \cdots \left| \int_a^x M\,dt \right| = M\frac{|x - a|^n}{n!} \tag{10.125}$$

Another form of $R_n(x)$ can be obtained if we note that $F(x)$ defined by

$$F(x) = \frac{1}{(n-1)!} \int_a^x f^{(n)}(t)(x - t)^{n-1}\,dt \qquad F(a) = 0$$

yields

$$F'(x) = \frac{1}{(n-2)!} \int_a^x f^{(n)}(t)(x - t)^{n-2}\,dt \qquad F'(a) = 0$$

$$F''(x) = \frac{1}{(n-3)!} \int_a^x f^{(n)}(t)(x - t)^{n-3}\,dt \qquad F''(a) = 0$$

$$\cdots$$

$$F^{(n)}(x) = f^{(n)}(x)$$

by applying the results of Prob. 4, Sec. 10.17. Integrating
$F^{(n)}(x) = f^{(n)}(x)$
n times over the range (a, x) yields $F(x) = R_n(x)$, so that

$$R_n(x) = \frac{1}{(n-1)!} \int_a^x f^{(n)}(t)(x - t)^{n-1}\,dt \tag{10.126}$$

The inequality of (10.125) results immediately from (10.126).
If it is known that $f(x)$ has derivatives of all orders and if it can be
shown that $\lim_{n\to\infty} R_n(x) = 0$, then (10.124) yields the important Taylor-series
expansion of $f(x)$ about x = a,

$$f(x) = \sum_{r=0}^{\infty} f^{(r)}(a)\frac{(x-a)^r}{r!} \tag{10.127}$$

It must be emphasized that (10.127) cannot be used as an expression for
$f(x)$ unless $R_n(x) \to 0$ as $n \to \infty$ (see Example 10.41). A special case of
(10.127) occurs if a = 0, with

$$f(x) = \sum_{r=0}^{\infty} f^{(r)}(0)\frac{x^r}{r!} \tag{10.128}$$

Equation (10.128) is the Maclaurin-series expansion of $f(x)$ about the
origin.




Example 10.39. Consider $f(x) = e^x$ with a = 0. All the derivatives of $f(x)$ exist
for $|x| \le X$, X arbitrary. For $|x| \le X$ we have $|f^{(n)}(x)| = |e^x| \le e^X$, so that

$$|R_n(x)| \le e^X\frac{|x|^n}{n!} \le e^X\frac{X^n}{n!}$$

The reader can verify that $\lim_{n\to\infty} \frac{X^n}{n!} = 0$. A simple proof of this statement occurs if
one considers the series $\sum_{n=0}^{\infty} \frac{X^n}{n!}$, which is seen to converge by the ratio test. Hence
the nth term, of necessity, must tend to zero. Thus $|R_n(x)| \to 0$ as $n \to \infty$ for all x,
and $e^x = \sum_{r=0}^{\infty} \frac{x^r}{r!}$ since $f^{(r)}(0) = 1$ for r = 0, 1, 2, ....


r=@0 
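Example 10.40. Consider $f(x) = (1 + x)^{-1/2}$. The reader can verify that

$$f^{(n)}(0) = (-1)^n \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2^n} = (-1)^n \frac{(2n)!}{2^{2n}\,n!}$$

n = 0, 1, 2, 3, .... Moreover

$$f^{(n)}(x) = (-1)^n \frac{(2n)!}{2^{2n}\,n!}(1 + x)^{-(2n+1)/2}$$

and for $|x| \le \sigma < 1$ we have

$$|R_n(x)| \le \frac{(2n)!}{2^{2n}\,n!\,n!} \frac{\sigma^n}{(1 - \sigma)^{n+1/2}} \tag{10.129}$$

Let the reader verify that $\lim_{n\to\infty} R_n(x) = 0$. Thus

$$(1 + x)^{-1/2} = \sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{2^{2n}\,n!\,n!}\,x^n \qquad |x| < 1 \tag{10.130}$$

The series of (10.130) converges for x = 1 and diverges for x = -1. The reader can
verify directly that $\lim_{n\to\infty} R_n(1) = 0$.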
Example 10.41. Consider $f(x) = e^{-1/x^2}$, $x \ne 0$, $f(0) = 0$. We have

$$f'(0) = \lim_{x\to0} \frac{f(x) - f(0)}{x} = \lim_{x\to0} \frac{e^{-1/x^2}}{x} = 0$$

To compute $f''(0)$, we note that

$$f''(0) = \lim_{x\to0} \frac{f'(x) - f'(0)}{x} = \lim_{x\to0} \frac{2e^{-1/x^2}}{x^4} = 0$$

It can be shown that $f^{(n)}(0) = 0$ for n = 0, 1, 2, 3, .... Hence the Maclaurin
series $\sum_{n=0}^{\infty} f^{(n)}(0)\frac{x^n}{n!}$ converges to zero for all values of x since every term of the series is
zero. Returning to (10.124), we note that $f(x) = e^{-1/x^2} = R_n(x)$ for all n so that
$R_n(x)$ does not tend to zero as n becomes infinite. The Maclaurin series of $f(x)$ does
not converge to $f(x)$ for this example. We note that the convergence of a Taylor-series
expansion of a function $f(x)$ does not guarantee that the series converges to $f(x)$.
One must always investigate the remainder, $R_n(x)$.


Another form of the Taylor series can be obtained if we replace x by
a + h, so that

$$f(a + h) = \sum_{r=0}^{\infty} f^{(r)}(a)\frac{h^r}{r!} \tag{10.131}$$

If we allow a to vary by replacing a by x, we obtain

$$f(x + h) = \sum_{r=0}^{\infty} f^{(r)}(x)\frac{h^r}{r!} \tag{10.132}$$

Equation (10.131) is very useful if we know $f^{(r)}(a)$ for all r and if we
wish to find an approximate value of $f(a + h)$. If only a finite number
of terms of the Taylor series are used in approximating $f(a + h)$, an estimate
of the size of the error can be obtained from (10.125).

Example 10.42. In Example 10.39 we saw that $|R_n(x)| \le e^X|x|^n/n!$ for the Maclaurin-series
expansion of $e^x$, $|x| \le X$. If we wish to find an approximate value of e,
we let x = 1 and note that $|R_n(1)| \le e/n! < 3/n!$. For n = 8 we have $|R_8(1)| <
3/8! < 0.0001$, so that $\sum_{r=0}^{7} \frac{1}{r!} \approx 2.718$ yields a value of e accurate to three places.
Actually e = 2.71828 ....
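
The estimate of Example 10.42 can be reproduced in a few lines. The Python sketch below is an illustration only (the function name is an assumption made here). It sums the first eight terms of the Maclaurin series of $e^x$ at x = 1 and compares the actual error with the bound 3/8!.

```python
import math

def maclaurin_exp_at_one(n_terms):
    """Sum of 1/r! for r = 0, ..., n_terms - 1 (Maclaurin series of e**x at x = 1)."""
    return sum(1.0 / math.factorial(r) for r in range(n_terms))

if __name__ == "__main__":
    approx = maclaurin_exp_at_one(8)      # terms r = 0, ..., 7
    bound = 3.0 / math.factorial(8)       # |R_8(1)| < 3/8!
    print(approx, abs(math.e - approx), bound)
```

The actual error is smaller than the bound, as it must be, since the bound replaces e by 3.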
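The Taylor-series extension to a function of more than one variable is
not very difficult. Consider $f(x^1, x^2, \ldots, x^n)$ as a function of the n
variables $x^1, x^2, \ldots, x^n$. Let $h^1, h^2, \ldots, h^n$ be any set of constants,
and define $\varphi(t)$ by

$$\varphi(t) = f(x^1 + h^1t,\ x^2 + h^2t,\ \ldots,\ x^n + h^nt) = f(u^1, u^2, \ldots, u^n) \qquad u^i = x^i + h^it,\ i = 1, 2, \ldots, n \tag{10.133}$$

We consider $x^1, x^2, \ldots, x^n$ as constants temporarily so that $\varphi(t)$ is
looked upon as a function of the single variable t. Let us assume that
$\varphi(t)$ has a Maclaurin-series expansion which converges for t = 1. We
have

$$\varphi(t) = \sum_{n=0}^{\infty} \varphi^{(n)}(0)\frac{t^n}{n!}$$

Now

$$\frac{d\varphi}{dt} = \frac{\partial f}{\partial u^\alpha}\frac{du^\alpha}{dt} = \frac{\partial f}{\partial u^\alpha}h^\alpha$$

and, at t = 0,

$$\left( \frac{d\varphi}{dt} \right)_0 = \frac{\partial f}{\partial x^\alpha}h^\alpha \qquad \text{since } u^\alpha = x^\alpha \text{ at } t = 0$$

Similarly

$$\frac{d^2\varphi}{dt^2} = \frac{\partial^2 f}{\partial u^\beta\,\partial u^\alpha}\frac{du^\beta}{dt}h^\alpha = \frac{\partial^2 f}{\partial u^\beta\,\partial u^\alpha}h^\alpha h^\beta$$

and $\left( \dfrac{d^2\varphi}{dt^2} \right)_0 = \dfrac{\partial^2 f}{\partial x^\alpha\,\partial x^\beta}h^\alpha h^\beta$. Continuing in this manner yields

$$\varphi(t) = f(x^1, x^2, \ldots, x^n) + \left( \frac{\partial f}{\partial x^\alpha}h^\alpha \right)t + \left( \frac{\partial^2 f}{\partial x^\alpha\,\partial x^\beta}h^\alpha h^\beta \right)\frac{t^2}{2!} + \cdots + \left( \frac{\partial^n f}{\partial x^\alpha\,\partial x^\beta \cdots \partial x^\sigma}h^\alpha h^\beta \cdots h^\sigma \right)\frac{t^n}{n!} + \cdots$$

For t = 1 we obtain

$$f(x^1 + h^1, x^2 + h^2, \ldots, x^n + h^n) = f(x^1, x^2, \ldots, x^n) + \frac{\partial f}{\partial x^\alpha}h^\alpha + \frac{1}{2!}\frac{\partial^2 f}{\partial x^\alpha\,\partial x^\beta}h^\alpha h^\beta + \cdots + \frac{1}{n!}\frac{\partial^n f}{\partial x^\alpha\,\partial x^\beta \cdots \partial x^\sigma}h^\alpha h^\beta \cdots h^\sigma + \cdots \tag{10.134}$$

For a function of two variables (10.134) becomes

$$f(x + h, y + k) = f(x, y) + \left( h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y} \right)f + \frac{1}{2!}\left( h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y} \right)^2 f + \cdots + \frac{1}{n!}\left( h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y} \right)^n f + \cdots \tag{10.135}$$

where

$$\left( h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y} \right)^n f = \sum_{r=0}^{n} \binom{n}{r} h^r k^{n-r} \frac{\partial^n f}{\partial x^r\,\partial y^{n-r}}$$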
Problems 
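1. Show that $\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!}$ holds for all values of x.

2. Show that $\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!}$ holds for all values of x.

3. Show that $\sinh x = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!}$ and $\cosh x = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}$ hold for all values
of x.

4. Show that $\ln(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}x^n}{n}$ holds for $-1 < x \le 1$.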



















5. Show that Ini = 2 te ~ holds for || <1. IfN = 7+# 
that0 <z<1. Determine in ; accurate to four places. 

6. How many terms in the Taylor-series expansion of sin x about $x = \pi/6$ are
needed to determine sin 31° accurate to six places? Evaluate sin 31° accurate to six
places.

7. By integrating $1/(1 + x^2)$ find the Maclaurin expansion of $\tan^{-1} x$.

8. Show that 


o 


at (2n + Gz Qn + 1)(2x + 1) 


z>0 





9. Find the first four terms of the Maclaurin-series expansion of ln cos x. For
what range of x is the series valid?

10. Find the Taylor series of $\cos^2 x$ about $x = \pi/3$.

11. If $f^{(n)}(x)$ is continuous on (a, x), show that

$$R_n(x) = f^{(n)}(a + \theta(x - a))\frac{(x-a)^n}{n!} = [f^{(n)}(a) + \epsilon]\frac{(x-a)^n}{n!} \qquad 0 < \theta < 1$$

by making use of (10.126).


10.24. Extrema of Functions. The Lagrange Method of Multipliers.
One says that $f(x)$ has an extremum at a point x = c if an $\eta > 0$ exists
such that $f(c + h) - f(c)$ has the same sign for $|h| < \eta$. If $f(c + h) \le f(c)$
for $|h| < \eta$, we say that $f(x)$ has a local maximum at x = c. If
$f(c + h) \ge f(c)$ for $|h| < \eta$, we say that $f(x)$ has a local minimum at x = c.
If an extremum of $f(x)$ at x = c exists and if $f(x)$ is differentiable at x = c,
we have

$$f'(c) = \lim_{\substack{h\to0 \\ h>0}} \frac{f(c+h) - f(c)}{h} \le 0 \text{ or } \ge 0$$
$$f'(c) = \lim_{\substack{h\to0 \\ h<0}} \frac{f(c+h) - f(c)}{h} \ge 0 \text{ or } \le 0 \tag{10.136}$$

so that $f'(c) = 0$. Now assume $f'(c) = f''(c) = \cdots = f^{(n-1)}(c) = 0$,
$f^{(n)}(c) \ne 0$, and further assume that $f^{(n)}(x)$ is continuous in a neighborhood
of x = c. Applying (10.124) and the result of Prob. 11, Sec. 10.23,
we have

$$f(x) = f(c) + [f^{(n)}(c) + \epsilon]\frac{(x-c)^n}{n!}$$
$$f(c + h) - f(c) = \frac{h^n}{n!}[f^{(n)}(c) + \epsilon]$$

Since $f^{(n)}(c) \ne 0$ and since $\epsilon \to 0$ as $h \to 0$, we can choose h sufficiently
small so that $f^{(n)}(c) + \epsilon$ is one-signed. Hence $f(c + h) - f(c)$ will be
one-signed provided n is even (h can be positive and negative). Thus a
necessary and sufficient condition that $f(x)$ have an extremum at x = c
is that $f'(c) = 0$ and the first nonvanishing derivative of $f(x)$ at x = c
be of even order. If $f^{(n)}(c) > 0$, a minimum occurs, while if $f^{(n)}(c) < 0$,
a maximum occurs. Why?

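For a function of two variables we have

$$f(x + h, y + k) - f(x, y) = \frac{\partial f}{\partial x}h + \frac{\partial f}{\partial y}k + \frac{1}{2!}(f_{xx}h^2 + 2f_{xy}hk + f_{yy}k^2) + \cdots$$

If we neglect the higher-order terms ($h^3$, $h^2k$, ...), we note that $f(x + h,
y + k) - f(x, y)$ will be one-signed provided $\frac{\partial f}{\partial x} = 0$, $\frac{\partial f}{\partial y} = 0$, and
$(f_{xx}h^2 + 2f_{xy}hk + f_{yy}k^2)$ does not change sign for arbitrarily small positive and
negative values of h and k. Let the reader deduce that $f(x, y)$ has an
extremal at (x, y) provided $\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = 0$, and

$$\left( \frac{\partial^2 f}{\partial x\,\partial y} \right)^2 - \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} < 0 \tag{10.137}$$

If $f_{xx} > 0$ or $f_{yy} > 0$, a minimum occurs, and if $f_{xx} < 0$ or $f_{yy} < 0$, a maximum
occurs. Why?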
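
The test (10.137) translates directly into a small routine. The Python sketch below is an illustration only: the function name and the labels it returns are assumptions made here, and the "saddle" and "inconclusive" cases are standard supplements that the text above does not discuss. It takes the second partial derivatives at a point where $\partial f/\partial x = \partial f/\partial y = 0$.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Classify a point with f_x = f_y = 0 from its second partial derivatives.

    Uses the sign of fxy**2 - fxx*fyy, as in (10.137).
    """
    discriminant = fxy * fxy - fxx * fyy
    if discriminant < 0:
        # (10.137) holds: the quadratic form is one-signed.
        return "minimum" if fxx > 0 else "maximum"
    if discriminant > 0:
        # The quadratic form changes sign (not covered in the text above).
        return "saddle"
    return "inconclusive"

if __name__ == "__main__":
    # f = x^2 + x*y + y^2 at the origin: fxx = 2, fxy = 1, fyy = 2
    print(classify_critical_point(2.0, 1.0, 2.0))
    # f = x^2 - y^2 at the origin: fxx = 2, fxy = 0, fyy = -2
    print(classify_critical_point(2.0, 0.0, -2.0))
```

When (10.137) holds, $f_{xx}$ and $f_{yy}$ necessarily have the same sign, which is why checking either one suffices.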

Let us consider the function $z = f(x, y)$ to be extremalized under the
condition $\varphi(x, y)$ = constant. If $\varphi(x, y) = c$ is a closed and bounded curve,
and if $f(x, y)$ is continuous at every point of the curve $\varphi(x, y) = c$, we
know that there will be a point $P_0$ on the curve $\varphi(x, y) = c$ such that
$f(x, y)$ will take on its maximum value at $P_0$. Remember that we restrict
P(x, y) to lie on the curve $\varphi(x, y) = c$. The same statement applies if
we consider the minimum value of $f(x, y)$. Now in the elementary calculus
we would solve $\varphi(x, y) = c$ for y as a function of x, say, $y = \psi(x)$,
substitute $y = \psi(x)$ into $z = f(x, y)$, to obtain $z = f(x, \psi(x))$, and then
we would set $\frac{dz}{dx} = 0$. We also have

$$\frac{dz}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\left( -\frac{\partial\varphi/\partial x}{\partial\varphi/\partial y} \right) = 0 \tag{10.138}$$

since $\frac{\partial\varphi}{\partial x} + \frac{\partial\varphi}{\partial y}\frac{dy}{dx} = 0$. We can obtain (10.138) by another procedure
due to Lagrange. Consider the new function

$$U = f(x, y) + \lambda\varphi(x, y) \tag{10.139}$$

We differentiate U with respect to x and y, assuming that x and y are
independent variables, while $\lambda$ is considered as a parameter. Setting
$\frac{\partial U}{\partial x} = 0$, $\frac{\partial U}{\partial y} = 0$ yields

$$\frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x} = 0 \qquad \frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y} = 0 \tag{10.140}$$

Eliminating $\lambda$ in (10.140) yields (10.138). This is Lagrange's method of
multipliers.





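More generally, if $y = f(x_1, x_2, \ldots, x_n)$ is to be extremalized subject
to the conditions $\varphi_i(x_1, x_2, \ldots, x_n) = c_i$, i = 1, 2, ..., m, we form

$$U = f(x_1, x_2, \ldots, x_n) + \sum_{i=1}^{m} \lambda_i\varphi_i(x_1, x_2, \ldots, x_n)$$

and we set

$$\frac{\partial U}{\partial x_j} = \frac{\partial f}{\partial x_j} + \sum_{i=1}^{m} \lambda_i\frac{\partial\varphi_i}{\partial x_j} = 0 \qquad j = 1, 2, \ldots, n \tag{10.141}$$

The elimination of the $\lambda_i$, i = 1, 2, ..., m, along with the equations
$\varphi_i(x_1, x_2, \ldots, x_n) = c_i$, i = 1, 2, ..., m, yields the values of $x_1, x_2,
\ldots, x_n$, which extremalize y.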


Example 10.43. We wish to find the ratio of altitude to radius of a cylindrical glass
(open top) having a maximum volume for a fixed surface area. We have

$$V = \pi x^2y \qquad S = 2\pi xy + \pi x^2 = \text{constant}$$

From $U = \pi x^2y + \lambda(2\pi xy + \pi x^2)$ we have

$$\frac{\partial U}{\partial x} = \pi[2xy + \lambda(2y + 2x)] = 0$$

$$\frac{\partial U}{\partial y} = \pi[x^2 + \lambda(2x)] = 0$$

Eliminating $\lambda$ yields y/x = 1. It is easy to show that y/x = 1 yields a maximum
volume.
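
The result y/x = 1 can be confirmed by brute force. The Python sketch below is an illustration only: it does not use multipliers, but instead eliminates y through the constraint and scans the radius on a fine grid. The value S = 3π is an arbitrary choice made here, and the function names are assumptions.

```python
import math

def altitude_for_radius(x, surface):
    """Altitude y of an open-top cylinder of radius x with S = 2*pi*x*y + pi*x**2."""
    return (surface - math.pi * x * x) / (2.0 * math.pi * x)

def best_ratio(surface, samples=100000):
    """Scan radii with y > 0 and return y/x at the radius of largest volume."""
    x_max = math.sqrt(surface / math.pi)          # radius at which y drops to 0
    best_x, best_volume = None, -1.0
    for k in range(1, samples):
        x = x_max * k / samples
        y = altitude_for_radius(x, surface)
        volume = math.pi * x * x * y
        if volume > best_volume:
            best_x, best_volume = x, volume
    return altitude_for_radius(best_x, surface) / best_x

if __name__ == "__main__":
    print(best_ratio(3.0 * math.pi))   # close to 1, as in Example 10.43
```

The grid search is crude but independent of the derivation, so agreement with y/x = 1 is a useful sanity check.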


Problems 


1. Consider a cylindrical buoy with two conical ends. Let x be the radius of the
cylinder, y the altitude of the cylinder, and z the altitude of the cones. Show that
$x:y:z = \tfrac{\sqrt{5}}{2}:1:1$ yields a maximum volume for a fixed surface area.

2. Find the maximum distance from the origin to the curve $x^3 + y^3 - 3xy = 0$.

3. Derive (10.137) if $f_{xx}h^2 + 2f_{xy}hk + f_{yy}k^2$ does not change sign for arbitrary
values of h and k.

4. A rectangular box has dimensions x, y, z. Show that x:y:z = 1:1:1 yields a
minimum surface area for a fixed volume.


5. Consider the set of values $y_0, y_1, y_2, \ldots, y_n$ and the polynomial

$$y(x) = a_0 + a_1x + a_2x^2 + \cdots + a_mx^m$$

m < n. Show that the set of $a_i$, i = 0, 1, 2, ..., m, which extremalize

$$S = \sum_{j=0}^{n} \left( y_j - \sum_{i=0}^{m} a_ix_j^i \right)^2$$

satisfy $\sum_{i=0}^{m} s_{i+k}a_i = t_k$ with $s_k = \sum_{j=0}^{n} x_j^k$, $t_k = \sum_{j=0}^{n} y_jx_j^k$.
2=0 27=0 j=0 


10.25. Numerical Methods. Let us assume that we wish to find a root
of $f(x) = 0$, and let us further assume that $f'(x)$ exists. We can plot a
rough graph of $y = f(x)$ and attempt to read the value of x at which the
curve crosses the x axis, y = 0 (see Fig. 10.4).

[Fig. 10.4: the curve y = f(x)]

If we choose a point $x_1$ not far removed from $\bar{x}$, $f(\bar{x}) = 0$, we note that
the equation of the tangent line at $(x_1, f(x_1))$ is $y - f(x_1) = f'(x_1)(x - x_1)$.
This line, L, intersects the x axis at the point $x_2 = x_1 - f(x_1)/f'(x_1)$,
$f'(x_1) \ne 0$, obtained by setting y = 0. This process can be continued, with

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \qquad f'(x_n) \ne 0 \tag{10.142}$$

If the sequence $x_1, x_2, \ldots, x_n, \ldots$ can be shown to converge to a
limit c, then

$$\lim_{n\to\infty} x_{n+1} = \lim_{n\to\infty} x_n - \lim_{n\to\infty} \frac{f(x_n)}{f'(x_n)}$$

and $c = c - f(c)/f'(c)$ so that $f(c) = 0$, provided $f(x)$ and $f'(x)$ are continuous
at x = c, $f'(c) \ne 0$. Thus $c = \bar{x} = \lim_{n\to\infty} x_n$. It is apparent that
difficulties will occur if $f'(x)$ is small near $x = \bar{x}$.





Example 10.44. For $f(x) = x^2 - 2 = 0$ we have $f'(x) = 2x$, and (10.142) becomes

$$x_{n+1} = x_n - \frac{x_n^2 - 2}{2x_n} = \frac{x_n^2 + 2}{2x_n} \tag{10.143}$$

From (10.143) we note that $0 < x_{n+1} < x_n$ provided $x_n^2 > 2$, $x_n > 0$. Let the reader
deduce that $x_{n+1}^2 > 2$ if $x_n^2 > 2$. Thus if we choose $x_1 = 2$, (10.143) will yield a
monotonic decreasing sequence bounded below by zero. The sequence must converge
to a solution of $x^2 - 2 = 0$. The computation may be arranged as follows:

    n    x_n        x_n^2 + 2       2x_n       x_{n+1} = (x_n^2 + 2)/(2x_n)
    1    2          6               4          1.5
    2    1.5        4.25            3          1.4167
    3    1.4167     4.00703889      2.8334     1.41422
    4    1.41422    4.0000182084    2.82844    1.41421

The desired root to five decimal places is x = 1.41421.
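
The iteration (10.142) is short to program. The Python sketch below is an illustration only; the function name `newton`, the stopping tolerance, and the iteration cap are assumptions made here rather than part of the text. It applies the iteration to $f(x) = x^2 - 2$ with the starting value $x_1 = 2$.

```python
def newton(f, fprime, x, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until successive values agree."""
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0.0:
            raise ZeroDivisionError("f'(x_n) = 0; the tangent never meets the x axis")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

if __name__ == "__main__":
    root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 2.0)
    print(root)   # approximates the positive root of x^2 - 2 = 0
```

The explicit check on the slope mirrors the requirement $f'(x_n) \ne 0$ in (10.142).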


An iteration process may be used to find a root of $x - \varphi(x) = 0$ or
$x = \varphi(x)$. We shall assume that $|\varphi'(x)| < A < 1$. We define $x_{n+1}$ by

$$x_{n+1} = \varphi(x_n) \qquad n = 1, 2, 3, \ldots \tag{10.144}$$

with $x_1$ arbitrary. Thus

$$x_{n+1} - x_n = \varphi(x_n) - \varphi(x_{n-1}) = (x_n - x_{n-1})\varphi'(\xi_n)$$

with $\xi_n$ lying between $x_{n-1}$ and $x_n$, by applying the law of the mean. Thus

$$|x_{n+1} - x_n| = |x_n - x_{n-1}| \cdot |\varphi'(\xi_n)| < |x_n - x_{n-1}|A \tag{10.145}$$

Now $x_n = x_0 + (x_1 - x_0) + (x_2 - x_1) + \cdots + (x_n - x_{n-1})$,
and $x_0 + \sum_{k=1}^{\infty} |x_k - x_{k-1}|$ converges by the ratio test since
$\dfrac{|x_{n+1} - x_n|}{|x_n - x_{n-1}|} < A < 1$, from (10.145). Thus

$$\lim_{n\to\infty} x_n = x_0 + \lim_{n\to\infty} \sum_{k=1}^{n} (x_k - x_{k-1}) = \bar{x}$$

exists. From (10.144) and the continuity of $\varphi(x)$ we have

$$\lim_{n\to\infty} x_n = \bar{x} = \lim_{n\to\infty} \varphi(x_{n-1}) = \varphi(\bar{x})$$

and $\bar{x}$ is the desired root of $x = \varphi(x)$.
Example 10.45. We find a root of x = ¥ sin z +1, g(x) = + sin x + 1, 
le'(z)| = gleos 2| <5 


The computation is arranged as follows: 


REAL-VARIABLE THEORY 471 


 n    $x_n$     $\sin x_n$    $x_{n+1} = \tfrac{1}{2} \sin x_n + 1$

 1    0         0             1
 2    1         0.8415        1.4208
 3    1.4208    0.9888        1.4944
 4    1.4944    0.9971        1.4986
 5    1.4986    0.9974        1.4987
 6    1.4987    0.9974        1.4987


We started with $x_1 = 0$ and obtained the desired root, $x = 1.4987$, accurate to four
decimal places. Had we graphed $y = x$ and $y = \tfrac{1}{2} \sin x + 1$, we would have noticed
that $x_1 = 1.5$ would be a good starting point.
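
The iteration (10.144) for this example can be sketched as follows. The stopping tolerance, the iteration cap, and the name `fixed_point` are assumptions made for illustration; the text itself simply iterates until the tabulated digits repeat.

```python
import math


def fixed_point(phi, x1, tol=1e-10, max_iter=100):
    """Iterate x_{n+1} = phi(x_n) until two successive iterates differ by less than tol."""
    x = x1
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("iteration did not settle within max_iter steps")


def phi(x):
    # Example 10.45: phi(x) = (1/2) sin x + 1, so |phi'(x)| <= 1/2 < 1.
    return 0.5 * math.sin(x) + 1.0


root, steps_used = fixed_point(phi, x1=0.0)
print(f"{root:.6f}", steps_used)
```

Near the root $|\varphi'(x)|$ is much smaller than the bound $\tfrac{1}{2}$, which is why so few steps are needed here.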


It is often useful to replace a given function by a polynomial which
approximates the given function to a high degree of accuracy. Such a
polynomial is called an interpolating function. Let $f(x)$ be a function
defined for $x_0 \le x \le x_n$, and let us assume that we have evaluated $f(x)$
at $x_0, x_1, x_2, \ldots, x_n$, obtaining the $n + 1$ values $y_i = f(x_i)$,
$i = 0, 1, 2, \ldots, n$. A simple way of constructing a polynomial of degree $n$,
$L_n(x)$, which satisfies $L_n(x_i) = y_i$, $i = 0, 1, 2, \ldots, n$, is due to Lagrange.
Let us note that the polynomial

$$y_0\,\frac{(x - x_1)(x - x_2) \cdots (x - x_n)}{(x_0 - x_1)(x_0 - x_2) \cdots (x_0 - x_n)}$$

has the value zero at $x = x_1, x_2, \ldots, x_n$ and has the value $y_0$ at $x = x_0$.
A sum of such polynomials yields Lagrange's interpolating polynomial,

$$\begin{aligned}
L_n(x) = {} & y_0\,\frac{(x - x_1)(x - x_2) \cdots (x - x_n)}{(x_0 - x_1)(x_0 - x_2) \cdots (x_0 - x_n)} \\
& + y_1\,\frac{(x - x_0)(x - x_2) \cdots (x - x_n)}{(x_1 - x_0)(x_1 - x_2) \cdots (x_1 - x_n)} + \cdots \\
& + y_n\,\frac{(x - x_0)(x - x_1) \cdots (x - x_{n-1})}{(x_n - x_0)(x_n - x_1) \cdots (x_n - x_{n-1})}
\end{aligned} \tag{10.146}$$


One naturally inquires how closely $L_n(x)$ approximates $f(x)$ for $x \neq x_i$,
$i = 0, 1, 2, \ldots, n$. To answer this question, we construct $F(x)$ defined
by

$$F(x) = f(x) - L_n(x) - R(x - x_0)(x - x_1) \cdots (x - x_n) \tag{10.147}$$

with $R$ a constant. Let us note that $F(x)$ vanishes for $x = x_0, x_1, x_2, \ldots, x_n$,
since $f(x_i) = L_n(x_i)$, $i = 0, 1, 2, \ldots, n$. Now let us choose
any point $\bar{x}$ of the interval $x_0 \le x \le x_n$ other than $x_0, x_1, x_2, \ldots, x_n$
and determine $R$ so that $F(\bar{x}) = 0$, that is,

$$f(\bar{x}) - L_n(\bar{x}) - R(\bar{x} - x_0)(\bar{x} - x_1) \cdots (\bar{x} - x_n) = 0 \tag{10.148}$$


Up to the present no restriction was placed on $f(x)$. If we assume that
$f(x)$ is differentiable at every point of $x_0 \le x \le x_n$, then $F(x)$ is differentiable
on this range. If we apply the law of the mean (or Rolle's
theorem) to the intervals $(x_0, x_1)$, $(x_1, x_2)$, $\ldots$, $(x_i, \bar{x})$, $(\bar{x}, x_{i+1})$, $\ldots$,
$(x_{n-1}, x_n)$, we note that $F'(x)$ will vanish at $n + 1$ points. If we further
assume that $f''(x), f'''(x), \ldots, f^{(n+1)}(x)$ exist on $x_0 \le x \le x_n$, we note
that $F^{(n+1)}(x)$ will vanish at a point $x = s$, $x_0 \le s \le x_n$. Since $L_n(x)$ is
a polynomial of degree $n$, we have $\dfrac{d^{n+1} L_n(x)}{dx^{n+1}} = 0$. From (10.147) we
obtain

$$F^{(n+1)}(s) = 0 = f^{(n+1)}(s) - R\,(n + 1)! \qquad x_0 \le s \le x_n$$

since $\dfrac{d^{n+1}}{dx^{n+1}} [(x - x_0)(x - x_1) \cdots (x - x_n)] = (n + 1)!$. Substituting
$R = f^{(n+1)}(s)/(n + 1)!$ into (10.148) yields

$$f(\bar{x}) - L_n(\bar{x}) = \frac{f^{(n+1)}(s)}{(n + 1)!}\,(\bar{x} - x_0)(\bar{x} - x_1) \cdots (\bar{x} - x_n) \tag{10.149}$$

Equation (10.149) holds for any value of $\bar{x}$ in the interval $x_0 \le \bar{x} \le x_n$
although (10.149) was obtained by choosing $\bar{x} \neq x_i$, $i = 0, 1, 2, \ldots, n$,
for one notes that $f(x_i) = L_n(x_i)$ is consistent with (10.149).

If $M$ is any upper bound of $|f^{(n+1)}(x)|$ on $x_0 \le x \le x_n$, we have

$$|f(x) - L_n(x)| \le \frac{M}{(n + 1)!}\,|x - x_0|\,|x - x_1| \cdots |x - x_n| \tag{10.150}$$

The inequality (10.150) is useful in finding an upper bound to the difference
between $f(x)$ and $L_n(x)$.


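Example 10.46. The third-degree polynomial passing through the points $(0, -5)$,
$(1, -3)$, $(2, -1)$, $(3, 7)$ is

$$\begin{aligned}
L_3(x) &= -5\,\frac{(x - 1)(x - 2)(x - 3)}{(-1)(-2)(-3)} + (-3)\,\frac{x(x - 2)(x - 3)}{(1)(-1)(-2)} \\
&\quad + (-1)\,\frac{x(x - 1)(x - 3)}{(2)(1)(-1)} + 7\,\frac{x(x - 1)(x - 2)}{(3)(2)(1)} \\
&= x^3 - 3x^2 + 4x - 5
\end{aligned}$$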
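
As a check on (10.146), the following sketch evaluates the Lagrange polynomial with exact rational arithmetic. It assumes the four data points of Example 10.46 with the third point taken as $(2, -1)$, the value the stated cubic gives at $x = 2$; the function name `lagrange` is illustrative only.

```python
from fractions import Fraction


def lagrange(points, x):
    """Evaluate the Lagrange interpolating polynomial (10.146) through `points` at x."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                # Factor (x - x_j)/(x_i - x_j): zero at every other node, one at x_i.
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


points = [(0, -5), (1, -3), (2, -1), (3, 7)]


def cubic(x):
    # The closed form stated in Example 10.46.
    return x**3 - 3 * x**2 + 4 * x - 5


for x in (-2, 0, 1, 2, 3, 5, Fraction(1, 2)):
    print(x, lagrange(points, x), cubic(x))
```

Since a cubic is determined by four points, agreement at any further value of $x$ confirms the expansion.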


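Example 10.47. Let $L_6(x)$ be the polynomial which coincides with $\sin x$ at $x = 0$,
$\pi/6$, $\pi/3$, $\pi/2$, $2\pi/3$, $5\pi/6$, $\pi$. Since $\left|\dfrac{d^7 \sin x}{dx^7}\right| = |\cos x| \le 1 = M$, we have

$$|\sin x - L_6(x)| \le \frac{\left|x\left(x - \frac{\pi}{6}\right)\left(x - \frac{\pi}{3}\right)\left(x - \frac{\pi}{2}\right)\left(x - \frac{2\pi}{3}\right)\left(x - \frac{5\pi}{6}\right)(x - \pi)\right|}{7!}$$

For $x = 40^\circ$ or $\tfrac{2}{9}\pi$ we have

$$\left|\sin 40^\circ - L_6\!\left(\tfrac{2}{9}\pi\right)\right| \le \frac{1}{7!}\left(\frac{2\pi}{9}\right)\left(\frac{\pi}{18}\right)\left(\frac{\pi}{9}\right)\left(\frac{5\pi}{18}\right)\left(\frac{4\pi}{9}\right)\left(\frac{11\pi}{18}\right)\left(\frac{7\pi}{9}\right) < 0.00005$$

Thus the use of $L_6(\tfrac{2}{9}\pi)$ would yield a value of $\sin 40^\circ$ accurate to four decimal places.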
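
The bound (10.150) for this example can be evaluated numerically and compared with the actual interpolation error. This is a sketch under the assumption of ordinary double-precision arithmetic; `lagrange_eval` is a hypothetical helper implementing (10.146), not something named in the text.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange polynomial (10.146) through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Seven equally spaced nodes 0, pi/6, ..., pi, as in Example 10.47.
nodes = [k * math.pi / 6 for k in range(7)]
values = [math.sin(t) for t in nodes]

x = math.radians(40.0)  # 40 degrees = 2*pi/9
# Right-hand side of (10.150) with M = 1 and n + 1 = 7.
bound = math.prod(abs(x - xk) for xk in nodes) / math.factorial(7)
actual = abs(math.sin(x) - lagrange_eval(nodes, values, x))
print(bound, actual)
```

The actual error is smaller than the bound because $|\cos s|$ at the unknown point $s$ is in general less than $M = 1$.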




Let us consider a function $f(x)$ defined on $-h \le x \le h$. We construct
the polynomial $L_2(x)$, which coincides with $f(x)$ at $x = -h, 0, h$.
From (10.146) we have

$$L_2(x) = f(-h)\,\frac{x(x - h)}{2h^2} + f(0)\,\frac{(x + h)(x - h)}{-h^2} + f(h)\,\frac{(x + h)x}{2h^2}$$

and

$$\int_{-h}^{h} L_2(x)\,dx = \frac{h}{3}\,[f(-h) + 4f(0) + f(h)] \tag{10.151}$$

We may well inquire how the value of (10.151) differs from $\int_{-h}^{h} f(x)\,dx$.
This difference is

$$\varphi(h) = \int_{-h}^{h} f(x)\,dx - \frac{h}{3}\,[f(-h) + 4f(0) + f(h)]$$


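We note that $\varphi(0) = 0$. Let us assume that $f'(x)$, $f''(x)$, $f'''(x)$, and
$f^{(4)}(x)$ exist. Differentiating $\varphi(h)$ with respect to $h$ yields

$$\begin{aligned}
\varphi'(h) &= f(h) + f(-h) - \tfrac{1}{3}[f(-h) + 4f(0) + f(h)] - \tfrac{h}{3}[-f'(-h) + f'(h)] \\
&= \tfrac{2}{3}[f(h) + f(-h)] - \tfrac{4}{3} f(0) - \tfrac{h}{3}[f'(h) - f'(-h)]
\end{aligned}$$

and $\varphi'(0) = 0$. Differentiating again yields

$$\begin{aligned}
\varphi''(h) &= \tfrac{2}{3}[f'(h) - f'(-h)] - \tfrac{1}{3}[f'(h) - f'(-h)] - \tfrac{h}{3}[f''(h) + f''(-h)] \\
&= \tfrac{1}{3}[f'(h) - f'(-h)] - \tfrac{h}{3}[f''(h) + f''(-h)]
\end{aligned}$$

so that $\varphi''(0) = 0$. Differentiating once more yields

$$\begin{aligned}
\varphi'''(h) &= \tfrac{1}{3}[f''(h) + f''(-h)] - \tfrac{1}{3}[f''(h) + f''(-h)] - \tfrac{h}{3}[f'''(h) - f'''(-h)] \\
&= -\tfrac{h}{3}[f'''(h) - f'''(-h)] \\
&= -\tfrac{2h^2}{3} f^{(4)}(\xi) \qquad -h \le \xi \le h
\end{aligned}$$

the last step following from the law of the mean. Now

$$\varphi(h) = \int_0^h dh \int_0^h dh \int_0^h \varphi'''(h)\,dh = -\frac{2}{3} \int_0^h dh \int_0^h dh \int_0^h f^{(4)}(\xi)\,h^2\,dh$$

If $M$ is any upper bound of $|f^{(4)}(x)|$ on $-h \le x \le h$, we have

$$|\varphi(h)| \le M \int_0^h |dh| \int_0^h |dh| \int_0^h \tfrac{2}{3} h^2\,|dh| = \frac{M h^5}{90} \tag{10.152}$$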


The inequality (10.152) yields an estimate of the error if we use
Simpson's rule as given by (10.151) to approximate $\int_{-h}^{h} f(x)\,dx$. We
note that Simpson's rule is exact if $f(x)$ is a polynomial of degree less
than or equal to 3 since $f^{(4)}(x) = 0$. Since (10.151) depends only on
the values of $f(x)$ at the equally spaced points $-h$, $0$, $h$, we can extend
(10.151) to the interval $a \le x \le b$ by subdividing $(a, b)$ into an even
number of intervals

$$a,\ a + h,\ a + 2h,\ a + 3h,\ \ldots,\ a + (n - 1)h,\ a + nh = b$$

applying Simpson's rule to the intervals

$$(a, a + 2h),\ (a + 2h, a + 4h),\ \ldots,\ (a + (n - 2)h, a + nh)$$

and adding. This yields

$$\begin{aligned}
\int_a^b f(x)\,dx &= \tfrac{h}{3}[f(a) + 4f(a + h) + f(a + 2h)] \\
&\quad + \tfrac{h}{3}[f(a + 2h) + 4f(a + 3h) + f(a + 4h)] + \cdots \\
&\quad + \tfrac{h}{3}[f(a + (n - 2)h) + 4f(a + (n - 1)h) + f(b)] + E \\
&= \tfrac{h}{3}\,(f(a) + 4f(a + h) + 2f(a + 2h) + 4f(a + 3h) \\
&\qquad + 2f(a + 4h) + \cdots + 4f(b - h) + f(b)) + E
\end{aligned} \tag{10.153}$$

with

$$|E| \le \frac{n}{2} \cdot \frac{M h^5}{90} = \frac{M n h^5}{180} = \frac{M (b - a) h^4}{180} \tag{10.154}$$

$M$ is any upper bound of $|f^{(4)}(x)|$ on the interval $a \le x \le b$. The greater
the number of subdivisions chosen and hence the smaller the value of $h$,
the more accurate becomes Simpson's rule for approximating an integral.


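Example 10.48. If we apply Simpson's rule to $\ln 2 = \int_1^2 \dfrac{dx}{x}$, $n = 10$, we note that
$|E| \le \dfrac{M(b - a)}{180}\,h^4 \le \dfrac{24}{180}\,(0.1)^4 < 0.0002$, so that we can expect to obtain a value of $\ln 2$
accurate to at least three decimal places. Applying (10.153) yields

$$\ln 2 \approx \frac{1}{30}\left(1 + \frac{4}{1.1} + \frac{2}{1.2} + \frac{4}{1.3} + \frac{2}{1.4} + \frac{4}{1.5} + \frac{2}{1.6} + \frac{4}{1.7} + \frac{2}{1.8} + \frac{4}{1.9} + \frac{1}{2}\right) = 0.69315$$

In the tables one notes that $\ln 2 = 0.69315$ accurate to five decimal places.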
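
The composite rule (10.153) and the error estimate (10.154) can be sketched in a few lines. The function name `simpson` and its signature are illustrative assumptions; the bound uses $M = 24$, the maximum of $|f^{(4)}(x)| = 24/x^5$ on $1 \le x \le 2$.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson's rule (10.153) on [a, b] with an even number n of subintervals."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        # Interior weights alternate 4, 2, 4, ..., 4.
        total += (4 if k % 2 else 2) * f(a + k * h)
    return h * total / 3


# Example 10.48: ln 2 as the integral of 1/x from 1 to 2 with n = 10.
approx = simpson(lambda x: 1.0 / x, 1.0, 2.0, 10)
error_bound = 24 * (2.0 - 1.0) * 0.1**4 / 180  # right-hand side of (10.154)
print(approx, abs(approx - math.log(2)), error_bound)

# A single application is exact for cubics, since the fourth derivative vanishes.
cubic_integral = simpson(lambda x: x**3, 0.0, 2.0, 2)
```

The observed error is well inside the bound, which is typical because (10.154) uses the worst-case value of the fourth derivative over the whole interval.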


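We conclude this section with a discussion of the Euler-Maclaurin sum
formula. Let us suppose that $f(x)$ has continuous derivatives at least to
the order $r$ on the interval $0 \le x \le 1$. If we integrate by parts, we obtain

$$\begin{aligned}
\int_0^1 f(x)\,dx &= x f(x)\Big|_0^1 - \int_0^1 x f'(x)\,dx \\
&= f(1) - \int_0^1 x f'(x)\,dx \\
&= \tfrac{1}{2}[f(0) + f(1)] + \tfrac{1}{2}[f(1) - f(0)] - \int_0^1 x f'(x)\,dx \\
&= \tfrac{1}{2}[f(0) + f(1)] + \tfrac{1}{2} \int_0^1 f'(x)\,dx - \int_0^1 x f'(x)\,dx \\
&= \tfrac{1}{2}[f(0) + f(1)] - \int_0^1 \left(x - \tfrac{1}{2}\right) f'(x)\,dx
\end{aligned} \tag{10.155}$$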


We define $B_2(x)$ such that $B_2'(x) = x - \tfrac{1}{2}$, $B_2(1) = 0$. Integration yields
$B_2(x) = \tfrac{1}{2}(x^2 - x)$. Replacing $x - \tfrac{1}{2}$ by $B_2'(x)$ in (10.155) and integrating
by parts yields

$$\int_0^1 f(x)\,dx = \tfrac{1}{2}[f(0) + f(1)] + \int_0^1 B_2(x) f''(x)\,dx \tag{10.156}$$

We now define $B_3(x)$ such that $B_3'(x) = B_2(x) + b_2$ with the stipulation
that $B_3(0) = B_3(1) = 0$. Thus $B_3(x) = x^3/6 - x^2/4 + b_2 x + c$, and
$c = 0$, $b_2 = \tfrac{1}{12}$. From $B_2(x) = B_3'(x) - b_2$ we have upon integration
by parts

$$\int_0^1 f(x)\,dx = \tfrac{1}{2}[f(0) + f(1)] - b_2[f'(1) - f'(0)] - \int_0^1 B_3(x) f'''(x)\,dx$$
This process is continued so that a sequence of polynomials (Bernoulli)
and a sequence of constants (Bernoulli) are generated. We have

$$\begin{aligned}
B_k'(x) &= B_{k-1}(x) + b_{k-1} \qquad k \ge 3 \\
B_k(0) &= B_k(1) = 0
\end{aligned} \tag{10.157}$$

and

$$\begin{aligned}
\int_0^1 f(x)\,dx &= \tfrac{1}{2}[f(0) + f(1)] - b_2[f'(1) - f'(0)] + b_3[f''(1) - f''(0)] \\
&\quad - \cdots + (-1)^{r} b_{r-1}[f^{(r-2)}(1) - f^{(r-2)}(0)] \\
&\quad + (-1)^{r-1} \int_0^1 B_r'(x) f^{(r-1)}(x)\,dx
\end{aligned} \tag{10.158}$$

In the generation of the Bernoulli polynomials and constants no mention
was made of $B_0$, $B_1$, $b_0$, $b_1$. For convenience we choose $B_0(x) = 0$,
$B_1(x) = x$, $b_0 = 1$, $b_1 = 0$.
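
The recursion (10.157) determines each constant and polynomial in turn: integrate $B_{k-1}(x)$ with $B_k(0) = 0$, then choose $b_{k-1}$ so that $B_k(1) = 0$. The sketch below does this with exact fractions; representing a polynomial as a list of coefficients (entry `i` multiplying `x**i`) and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def integrate(coeffs):
    """Antiderivative with zero constant term; coeffs[i] is the coefficient of x**i."""
    return [Fraction(0)] + [c / (i + 1) for i, c in enumerate(coeffs)]


def evaluate(coeffs, x):
    return sum(c * x**i for i, c in enumerate(coeffs))


def bernoulli(kmax):
    """Return dicts B, b with B[k] = coefficients of B_k(x) and b[k] = b_k, per (10.157)."""
    B = {2: [Fraction(0), Fraction(-1, 2), Fraction(1, 2)]}  # B_2(x) = (x^2 - x)/2
    b = {}
    for k in range(3, kmax + 1):
        poly = integrate(B[k - 1])       # integral of B_{k-1}, vanishing at x = 0
        b[k - 1] = -evaluate(poly, 1)    # forces B_k(1) = 0
        poly[1] += b[k - 1]              # add the term b_{k-1} * x
        B[k] = poly
    return B, b


B, b = bernoulli(7)
for k in sorted(b):
    print(k, b[k])
```

In this normalization the constants are those of the text, for example $b_2 = \tfrac{1}{12}$, and they differ from the more common Bernoulli numbers by a factor of $k!$.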


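Let us see whether it is possible to find two functions, $\Phi(x, t)$, $\varphi(t)$,
such that

$$\begin{aligned}
\Phi(x, t) &= \sum_{r=0}^{\infty} B_r(x)\,t^r \\
\varphi(t) &= \sum_{r=0}^{\infty} b_r\,t^r
\end{aligned} \tag{10.159}$$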




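If $\Phi(x, t)$ and $\varphi(t)$ exist satisfying (10.159), we call them the generating
functions of the Bernoulli polynomials and Bernoulli numbers, respectively.
The series expansions are simply the Taylor series expansions
(see Sec. 10.23), so that $b_r = \varphi^{(r)}(0)/r!$, $B_r(x) = \dfrac{1}{r!} \dfrac{\partial^r \Phi(x, t)}{\partial t^r}\bigg|_{t=0}$.
Formally we have

$$\begin{aligned}
\frac{\partial \Phi(x, t)}{\partial x} &= \sum_{r=0}^{\infty} B_r'(x)\,t^r \\
&= B_1'(x)\,t + B_2'(x)\,t^2 + \sum_{r=3}^{\infty} B_r'(x)\,t^r \\
&= t + \left(x - \tfrac{1}{2}\right) t^2 + \sum_{r=3}^{\infty} [B_{r-1}(x) + b_{r-1}]\,t^r \\
&= t + \left(x - \tfrac{1}{2}\right) t^2 + t \sum_{s=2}^{\infty} (B_s(x) + b_s)\,t^s \\
&= t + \left(x - \tfrac{1}{2}\right) t^2 + t \sum_{s=0}^{\infty} (B_s(x) + b_s)\,t^s - t(1 + xt) \\
&= t\,\Phi(x, t) + t\,\varphi(t) - \tfrac{1}{2} t^2
\end{aligned} \tag{10.160}$$

so that $\dfrac{\partial \Phi(x, t)}{\partial x} - t\,\Phi(x, t) = t\,\varphi(t) - \tfrac{1}{2} t^2$.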


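Integrating with respect to $x$ yields

$$\Phi(x, t) = A(t)\,e^{xt} - \varphi(t) + \frac{t}{2}$$

with $A(t)$ an arbitrary function of $t$. Now $\Phi(0, t) = 0$ since $B_r(0) = 0$,
so that $A(t) = \varphi(t) - \tfrac{1}{2} t$, and

$$\Phi(x, t) = \left[\varphi(t) - \tfrac{1}{2} t\right](e^{xt} - 1)$$

Finally $\Phi(1, t) = \sum_{r=0}^{\infty} B_r(1)\,t^r = t$ since $B_1(1) = 1$, $B_r(1) = 0$, $r \neq 1$.
Hence $\left[\varphi(t) - \tfrac{1}{2} t\right](e^t - 1) = t$, and

$$\begin{aligned}
\varphi(t) &= \frac{t}{e^t - 1} + \frac{t}{2} \\
\Phi(x, t) &= \frac{t\,(e^{xt} - 1)}{e^t - 1}
\end{aligned} \tag{10.161}$$

One can now consider $\varphi(t)$ and $\Phi(x, t)$ of (10.161) and justify the series
expansions (10.159) along with the term-by-term differentiation given in
(10.160). One then easily shows that (10.157) is valid and that $B_0(x) = 0$,
$B_1(x) = x$, $b_0 = 1$, $b_1 = 0$. We omit proof of this fact. An important
property of the Bernoulli constants is discernible if we note that

$$\begin{aligned}
\varphi(-t) &= \frac{-t}{e^{-t} - 1} - \frac{t}{2} = \frac{t\,e^t}{e^t - 1} - \frac{t}{2} \\
&= t + \frac{t}{e^t - 1} - \frac{t}{2} = \frac{t}{e^t - 1} + \frac{t}{2} = \varphi(t)
\end{aligned}$$

Hence $\varphi(t)$ is an even function so that $b_{2r+1} = 0$ for all $r$ [see (10.159)].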
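
A quick numerical check of the generating function for the constants can be made by comparing the closed form $t/(e^t - 1) + t/2$ with the first few terms of its power series. The sketch assumes the values $b_0 = 1$, $b_2 = \tfrac{1}{12}$, $b_4 = -\tfrac{1}{720}$, $b_6 = \tfrac{1}{30240}$, which follow from the recursion (10.157); the function names are illustrative.

```python
import math

# Nonzero constants b_0, b_2, b_4, b_6 in the normalization of the text.
b = {0: 1.0, 2: 1 / 12, 4: -1 / 720, 6: 1 / 30240}


def phi(t):
    """Closed form t/(e^t - 1) + t/2, valid for t != 0."""
    return t / math.expm1(t) + t / 2


def phi_series(t):
    """Power series sum of b_r t^r, truncated after the t^6 term."""
    return sum(coeff * t**r for r, coeff in b.items())


t = 0.5
closed, truncated = phi(t), phi_series(t)
print(closed, truncated, phi(-t))
```

The truncated series contains only even powers, so it is even by construction; the closed form agrees with it and with its own value at $-t$.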
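If we take $r = 2s + 1$ and apply (10.157) to (10.158), we obtain

$$\begin{aligned}
\int_0^1 f(x)\,dx &= \tfrac{1}{2}[f(0) + f(1)] - b_2[f'(1) - f'(0)] \\
&\quad - b_4[f'''(1) - f'''(0)] - \cdots - b_{2s}[f^{(2s-1)}(1) - f^{(2s-1)}(0)] \\
&\quad - \int_0^1 B_{2s+1}(x) f^{(2s+1)}(x)\,dx
\end{aligned} \tag{10.162}$$

If we apply (10.162) to $g(x) = f(x + 1)$, we obtain

$$\begin{aligned}
\int_0^1 g(x)\,dx &= \int_0^1 f(x + 1)\,dx = \int_1^2 f(x)\,dx \\
&= \tfrac{1}{2}[f(1) + f(2)] - b_2[f'(2) - f'(1)] - \cdots - \int_1^2 B_{2s+1}(x - 1) f^{(2s+1)}(x)\,dx
\end{aligned}$$

Continuing this process for the intervals $(2, 3)$, $(3, 4)$, $\ldots$, $(n - 1, n)$
and adding yields the Euler-Maclaurin formula,

$$\begin{aligned}
\int_0^n f(x)\,dx &= \tfrac{1}{2} f(0) + f(1) + f(2) + \cdots + f(n - 1) + \tfrac{1}{2} f(n) \\
&\quad - b_2[f'(n) - f'(0)] - b_4[f'''(n) - f'''(0)] - \cdots \\
&\quad - b_{2s}[f^{(2s-1)}(n) - f^{(2s-1)}(0)] \\
&\quad - \sum_{m=0}^{n-1} \int_m^{m+1} B_{2s+1}(x - m) f^{(2s+1)}(x)\,dx
\end{aligned} \tag{10.163}$$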
The Bernoulli polynomials and constants of (10.163) can be calculated
from (10.157) or (10.161). It can be shown that if $f^{(2s+1)}(x)$ is monotonic
decreasing for $0 \le x \le n$ and if the odd derivatives of $f(x)$ have the same
sign on $0 \le x \le n$, then the true value of $\int_0^n f(x)\,dx$ always lies between
the sums of $s$ and $s + 1$ terms of the series following $\tfrac{1}{2} f(0) + f(1) + \cdots + f(n - 1) + \tfrac{1}{2} f(n)$
in (10.163). For this case we need not be concerned with

$$\sum_{m=0}^{n-1} \int_m^{m+1} B_{2s+1}(x - m) f^{(2s+1)}(x)\,dx$$
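
The bracketing property just stated can be observed numerically. The sketch below applies (10.163) to $f(x) = \ln(x + 1)$ on $0 \le x \le 10$, whose odd derivatives are positive and decreasing. The choice $n = 10$, the hard-coded constants $b_2$, $b_4$, $b_6$, and all function names are assumptions made for illustration.

```python
import math

# Bernoulli constants b_2, b_4, b_6 in the normalization of the text.
b = {2: 1 / 12, 4: -1 / 720, 6: 1 / 30240}


def f(x):
    return math.log(x + 1.0)


def f_deriv(k, x):
    """k-th derivative of ln(x + 1): (-1)^(k-1) (k-1)! / (x + 1)^k, for k >= 1."""
    return (-1) ** (k - 1) * math.factorial(k - 1) / (x + 1.0) ** k


n = 10
exact = (n + 1) * math.log(n + 1) - n  # integral of ln(x + 1) from 0 to n

# Leading part of (10.163): f(0)/2 + f(1) + ... + f(n-1) + f(n)/2.
partial_sums = [0.5 * f(0) + sum(f(m) for m in range(1, n)) + 0.5 * f(n)]
for k in (2, 4, 6):
    correction = -b[k] * (f_deriv(k - 1, n) - f_deriv(k - 1, 0))
    partial_sums.append(partial_sums[-1] + correction)

for s, value in enumerate(partial_sums):
    print(s, value, value - exact)
```

Successive partial sums fall alternately below and above the exact value, so each consecutive pair encloses it.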
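Example 10.49. Stirling's Formula. For $f(x) = \ln (x + 1)$ we note that

$$f^{(2s+1)}(x) = \frac{(2s)!}{(x + 1)^{2s+1}} > 0$$

for $0 \le x \le n$ and $f^{(2s+1)}(x)$ is monotonic decreasing on $0 \le x \le n$. Integration by
parts yields

$$\begin{aligned}
(n + 1) \ln (n + 1) - n &= \int_0^n \ln (x + 1)\,dx = \tfrac{1}{2} \ln 1 + \ln 2 + \ln 3 + \cdots \\
&\quad + \ln n + \tfrac{1}{2} \ln (n + 1) - b_2\left(\frac{1}{n + 1} - 1\right) - b_4\left(\frac{2!}{(n + 1)^3} - 2!\right) - \cdots
\end{aligned} \tag{10.164}$$

Now

$$(n + 1) \ln (n + 1) = (n + 1) \ln \left[n\left(1 + \frac{1}{n}\right)\right] = (n + 1) \ln n + (n + 1) \ln \left(1 + \frac{1}{n}\right)$$

and, for large $n$, $(n + 1) \ln (n + 1) \approx (n + 1) \ln n + 1$, since

$$\lim_{n\to\infty} (n + 1) \ln \left(1 + \frac{1}{n}\right) = 1$$

Neglecting those terms which tend to zero as $n \to \infty$, we note from (10.164) that

$$\left(n + \tfrac{1}{2}\right) \ln n - n = \ln n! + C \tag{10.165}$$

where $C$ is the sum of the constant terms. Let the reader apply (4.47),
with (10.165) to show that $C = -\tfrac{1}{2} \ln 2\pi$. Thus

$$\ln \frac{n!}{n^n \sqrt{2\pi n}} \approx -n \qquad n! \approx \sqrt{2\pi n}\; n^n e^{-n} = \sqrt{2\pi n} \left(\frac{n}{e}\right)^n \tag{10.166}$$

Formula (10.166) is Stirling's formula for approximating $n!$ for $n$ large. The true
meaning of (10.166) is that

$$\lim_{n\to\infty} \frac{n!}{\sqrt{2\pi n} \left(\dfrac{n}{e}\right)^n} = 1$$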
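
The limit statement can be illustrated by tabulating the ratio of $n!$ to the Stirling approximation for a few values of $n$. This is a sketch in ordinary floating-point arithmetic; the sample values of $n$ and the function name `stirling` are arbitrary choices, not taken from the text.

```python
import math


def stirling(n):
    """Stirling's approximation (10.166): sqrt(2 pi n) (n/e)^n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


sample = (1, 5, 10, 50, 100)
ratios = {n: math.factorial(n) / stirling(n) for n in sample}
for n in sample:
    print(n, ratios[n])
```

The ratio stays above 1 and approaches it as $n$ grows, so (10.166) slightly underestimates $n!$ while its relative error shrinks.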


Problems 


1. Find a root of $x^3 - 3x + 1 = 0$ lying between $x = 0$ and $x = 1$, accurate to six
decimal places.
2. Find a root of $x = \ln (x + 1) + 1$, accurate to five decimal places.
3. Find a polynomial $p(x)$ such that $p(0) = 0$, $p(1) = 1$, $p(2) = 2$, $p(3) = 3$,
$p(4) = 11$.
4. Apply Simpson's rule with 10 subdivisions to approximate $\int_0^1 \dfrac{dx}{1 + x^2}$. Do the
same for $\int_0^1 \dfrac{dx}{\sqrt{2 - \sin^2 x}}$. How accurate can you expect your answers to be?
5. Show that $b_3 = 0$, $B_4(x) = \tfrac{1}{24}(x^4 - 2x^3 + x^2)$.
6. Show that $C = -\tfrac{1}{2} \ln 2\pi$ [see (10.165)].


10.26. The Lebesgue Integral. It is rather obvious that not all functions
are Riemann-integrable (R-integrable). We consider the interval
$0 \le x \le 1$ and define $f(x) = 1$ if $x$ is irrational, $f(x) = 2$ if $x$ is rational.
The upper Darboux integral is seen to be 2, whereas the lower Darboux
integral is 1 (see Sec. 10.16). Thus $f(x)$ is not R-integrable on $0 \le x \le 1$.
However, let us consider the following: It is known that the rationals are
countable (see Sec. 10.11), so that we can enumerate the rationals on the
interval $0 \le x \le 1$, and we designate the rationals by

$$r_1, r_2, r_3, \ldots, r_n, \ldots$$


We define $f_1(x)$ on $0 \le x \le 1$ by $f_1(x) = 1$ for $x \neq r_1$, $f_1(r_1) = 2$. We
define $f_2(x)$ by $f_2(x) = 1$, $x \neq r_1, r_2$, $f_2(r_1) = f_2(r_2) = 2$. Proceeding in this
manner, we construct a sequence of functions

$$f_1(x), f_2(x), \ldots, f_n(x), \ldots \tag{10.167}$$

with $f_n(x) = 1$ except at $x = r_1, r_2, \ldots, r_n$, and at these rational points,
$f_n(x) = 2$. It is apparent that

$$\lim_{n\to\infty} f_n(x) = f(x) = \begin{cases} 1 & \text{for } x \text{ irrational} \\ 2 & \text{for } x \text{ rational} \end{cases}$$

Moreover the sequence $\{f_n(x)\}$ is monotonic nondecreasing as $n$ increases,
and we write $f_n(x) \nearrow f(x)$. The reader can easily prove that every function
of the sequence $\{f_n(x)\}$ is R-integrable, with $\int_0^1 f_n(x)\,dx = 1$ for all $n$.
We define the Lebesgue integral of $f(x)$ as

$$L(f) = \int_0^1 f(x)\,dx = \lim_{n\to\infty} \int_0^1 f_n(x)\,dx = \lim_{n\to\infty} 1 = 1 \tag{10.168}$$

Even though $f(x)$ is not R-integrable, it is Lebesgue integrable (L-integrable),
and the value of its Lebesgue integral is defined by (10.168).
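
The claim that each $f_n(x)$ is R-integrable with integral 1 can be made concrete with Darboux sums. In the sketch below the rationals of $0 \le x \le 1$ are enumerated by increasing denominator, which is only one of many possible enumerations, and the upper sum of $f_n$ is computed exactly on $N$ equal subintervals. The function names and the choices $n = 5$, $N = 1000$ are assumptions made for illustration.

```python
from fractions import Fraction


def first_rationals(count):
    """First `count` rationals of [0, 1], listed by increasing denominator."""
    found = []
    q = 1
    while len(found) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in found and len(found) < count:
                found.append(r)
        q += 1
    return found


def upper_sum(points, N):
    """Upper Darboux sum of f_n on N equal subintervals of [0, 1].

    f_n is 2 at the listed points and 1 elsewhere, so the supremum on a closed
    subinterval is 2 exactly when the subinterval contains one of the points.
    """
    total = Fraction(0)
    for i in range(N):
        lo, hi = Fraction(i, N), Fraction(i + 1, N)
        sup = 2 if any(lo <= r <= hi for r in points) else 1
        total += Fraction(sup, N)
    return total


points = first_rationals(5)
upper = upper_sum(points, 1000)
lower = Fraction(1)  # every subinterval contains points where f_n = 1
print(points, upper, lower)
```

Each exceptional point lies in at most two subintervals, so the upper sum is at most $1 + 2n/N$, which tends to 1 as $N$ grows while the lower sum is always 1.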

More generally, we have the following situation: Let $C$ be the class of
all R-integrable functions on the range $a \le x \le b$, and let $f(x)$ be the
limit of a monotonic nondecreasing sequence, $\{f_n(x)\}$, of functions belonging
to $C$. We define the Lebesgue integral of $f(x)$ by

$$L(f) = \int_a^b f(x)\,dx = \lim_{n\to\infty} \int_a^b f_n(x)\,dx \tag{10.169}$$

provided the limit exists. Let $\bar{C}$ denote the class of all such functions
which are L-integrable. Any function which is R-integrable is automatically
L-integrable since we may take $f_n(x) = f(x)$ for all $n$, so that $f_n(x) \nearrow f(x)$ and

$$\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$$

Thus $C$ is a proper subset of $\bar{C}$. Any function which is R-integrable is
L-integrable, and the values of the two integrals are the same. It can
be shown that if $f(x)$ is a limit of a monotonic nondecreasing sequence of
functions belonging to $\bar{C}$ then $f(x)$ also belongs to $\bar{C}$. An analogous situation
occurs in the real-number system. We define the irrational numbers
as the limits of bounded sequences of rational numbers. The limit
of a bounded sequence of irrational and/or rational numbers is again a
real number, rational or irrational.

To extend the class $\bar{C}$ in order to embrace a larger class of functions,
we consider the following: A function $f(x)$ is said to belong to the class
$C_1$ if for any $\epsilon > 0$ we can find two functions, $g(x)$ and $h(x)$, belonging to
$\bar{C}$ such that

$$g(x) \le f(x) \le h(x) \quad \text{for } a \le x \le b \qquad \text{and} \qquad L(h) - L(g) < \epsilon \tag{10.170}$$

If $f(x)$ belongs to $C_1$, we define its Lebesgue integral by

$$\mathcal{L}(f) = \int_a^b f(x)\,dx = \sup_{g \in \bar{C}} L(g) = \inf_{h \in \bar{C}} L(h) \tag{10.171}$$

for all $g(x)$ and $h(x)$ satisfying (10.170). The reader can easily verify
that if $f(x)$ belongs to $\bar{C}$ then $f(x)$ belongs to $C_1$, and $L(f) = \mathcal{L}(f)$. The
set $C_1$ contains all Lebesgue-integrable functions.

The ideas contained above lead to the concepts of measurable func- 
tions and measurable sets. In most texts one begins with the definition 
of the measure of a set, and afterward the Lebesgue integral is defined. 
We omit a discussion of measure, but we do emphasize that the theory 
of measure is of prime importance in modern probability theory. The 
Lebesgue integral plays an important role in modern mathematical analy- 
sis with its applications to the Fourier integral, probability theory, ergodic 
theory, and other fields. 




INDEX 


A posteriori probability, 345 
A priori probability, 345 
Abelian group, 300 
Abel’s lemma, 450 
Abel’s test for uniform convergence, 451 
Absolute convergence, 444, 455 
Absolute tensors, 89 
Absolute value of complex number, 123 
Acceleration, 50, 52 
Addition, parallelogram law of, 38, 123 
of tensors, 89 
of vectors, 38 
Affine transformation, 313 
Aleph zero, 394 
Algebraic extension of a field, 324-326 
Algebraic numbers, 396 
Alternating series, 443 
Amplitude of complex number, 123 
Analytic continuation, 150-151 
Analytic function, 130 
Angular momentum, 115 
Angular velocity, 46 
Arc length, 52 
Archimedean ordering postulates, 385 
Arcs, rectifiable, 277 
Argand plane, 123 
Argument of complex number, 123 
Arithmetic n-space, 84 
Associated Legendre differential equa- 
tion, 230 
Associated vector, 94 
Automorphism, 304 
of a field, 331 
of a group, 304 
inner, 305 
Axiom of choice, 392 


Bayes’s formula, 346 

Bayes’s theorem, 344-346 
Bernoulli numbers, 475-477 
Bernoulli polynomials, 475—477 
Bernoulli trials, 349 
Bernoulli’s theorem, 358 
Bessel function, 232 


Bessel’s equation, 231 

Bessel’s inequality, 247, 258 

Beta function, 170 

Bianchi’s identity, 106 

Bienaymé-Tchebysheff inequality, 355 

Bilinear transformation, 125 

Binomial coefficient, 339 

Binomial theorem, 339 

Binormal, 53 

“‘Bits’”’ of information, 346 

Bonnet, second law of the mean for inte- 
grals, 457 

Boole, G., 388 

Boolean operator, 0 = z 5, 221 

Boundary of a set, 389 

Boundary point, 387 

Bounded set, 386 

Bounded variation, 277 

Brachistochrone, 289, 293 

Branch point, 149 

Buffon needle problem, 355 


Calculus of variations, 288 
problem of constraint in, 295 
variable end-point problem in, 293 

Cancellation, law of, 375 

Cantor set, 396 

Cantor’s theory of sets, 394-396 

Cardinal numbers, 394 

Cartesian coordinate system, 93 

Cauchy, convergence criterion for se- 

quence, 437 
inequality of, 391 
integral formula of, 143 
integral theorem of, 136 
nth-root test of, 442 

Cauchy’s distribution function, 361 

Cauchy-Riemann equations, 148 

Cayley’s theorem, 303 

Central of a group, 307 

Central-limit theorem, 359-361 

Character of a group, 312 

Characteristic function, 357 


481 


482 


Characteristic roots, 24 
Characteristic vector, 24 
Chi-squared distribution, 362-364 
Christoffel symbols, 96 
law of transformation of, 98 
Christoffel-Darboux identity, 244 
Closed interval, 386 
Closed set, 386, 388 
of orthogonal polynomials, 249 
Closure of a set, 389 
Coefficients, Fourier, 245, 252 
Cofactor, 9 
Combinations, 338 
Commutative law, of integers, 375 
of vector addition, 38 
Commutator, 303 
Comparison test for series, 441 
Complement of a set, 387 
Complete set of orthogonal polynomials, 
249-250 
Complex functions, 125 
continuous, 126 
differentiability of, 127 
(See also Complex variable) 
Complex-number field, 122 
Complex numbers, 122 
absolute value of, 123 
argument of, 124 
conjugate of, 124 
modulus of, 123 
vector representation of, 123 
Complex variable, functions of, 125 
analytic or regular, 130 
derivative of, 127 
integration of, 131-136 
Taylor’s expansion of, 145 
Components, of a tensor, 89 
of a vector, 41 
Composition, series of, 311 
prime factors of, 311 
Conditional probability , 345 - 
Conditionally convergent series, 444 
Confluence of singularities, 223 
Conjugate of a complex number, 124 
Conjugate elements of a group, 306 
Conjugate numbers, 325 
Conjugate subgroups, 306 
Conservative vector field, 67 
Continuity, 126, 398 
unjform, 399 
Continuous function, 398 
Continuous transformation groups, 314— 
315 
Contour integration, 160-162 
Contraction of a tensor, 90 
Contravariant tensor, 89 


ELEMENTS OF PURE AND APPLIED MATHEMATICS 


Convergence, absolute, 444, 455 
conditional, 444 
of Fourier series, 251, 256 
interval of, 386, 453 
of power series, 453-454 
radius of, 146, 453 
of series, 439 
tests for, 440-444 
Cauchy’s nth-root, 442 
D’Alembert’s ratio, 441 
integral, 441 
uniform, 446, 450 
tests for, 451-453 
Abel’s, 451 
Coordinate curves, 95 
Coordinate systems, 40, 84 
Coordinates, cartesian, 93 
curvilinear, 62, 65 
cylindrical, 65 
Euclidean, 93 
geodesic, 104 
normal, 32 
Riemannian, 105 
spherical, 63 
Laplacian in, 103 
transformation of, 84, 411 
Cosets, 305 
Cosh z, 150, 158 
Cosine z, 131 
Countable collection, 394 
Covariant curvature tensor, 106 
Covariant derivative, 100 
Covariant differentiation, 100 
Covariant tensor, 89 
Covariant vector, 87 
Cramer’s rule, 10 
Craps, game of, 348 
Criterion, of Eisenstein, 328 
of Lipschitz, 183 
Cross or vector product of vectors, 45 
Curl, 59 
of a gradient, 60 
of a vector, 59, 83, 102 
Curvature, radius of, 53 
scalar, 106 
Curvature tensor, 106, 108 
Curve, unicursal, 435 
(See also Space curve) 
Curves, family of, 174 
Curvilinear coordinates, 62 
curl, divergence, gradient, Laplacian 
in, 65 
Cyclic group, 300 


D’Alembert’s ratio test, 441 
Darboux integral, 419 


Definite integral, 131, 419-424 
mean-value theorems for, 422, 457 
Del (V), 57 
De Moivre, formula of, 125 
Density or weight function, 237 
Dependence, linear, 20, 187 
Derivative, 402 
covariant, 100 
of a determinant, 8 
of an integral, 142 
of matrix, 13 
partial, 408 
of a vector, 50 
Desargues’ theorem, 40 
Determinant, cofactors of, 9 
derivative of, 8 
expansion of, 9 
of matrix, 5, 13 
minors of, 10 
order of, 6 
Vandermonde, 334 
Wronskian, 187 
Determinants, 5 
multiplication of, 9 
properties of, 7-10 
solution of equations by, 10 
Diameter of a set, 390 
Dice (craps), 348 
Differential, 402 
operator, 410 
total, 409 
Differential equations, 172 
Bessel’s, 231 
in complex domain, 201 
definition of, 172 
degree of, 175 
exact, 178 
first-order, 175 
homogeneous, 177 
hypergeometric, 221 
integrating factors of, 179 
Laplace’s, 226 
Legendre’s, 201, 222 
linear (see Linear differential equa- 
tions) 
nonlinear, 271 
partial (see Partial differential equa- 
tions) 
second-order, 199-202 
separation of variables in, 175 
simultaneous, 190 
singular point of, 207 
solutions of, existence of, 183 
general, 182, 191 
particular, 182 
uniqueness of, 185 
Sturm-Liouville, 200 


INDEX 483 


Differential equations, system of, 190° 
wave, 172, 286 ' 


Differential operator, s = 


Differentiation, 402 

covariant, 100 

of Fourier series, 262 

implicit, 403 

partial, 408 

rules, 61, 403 

of series, 450 

of vectors, 50, 61 
Diffusion equation, 367-368 
Dini’s conditions, 457-459 
Direction cosines, 43 
Directional derivative, 56 
Discontinuity of a function, 398 
Distribution, chi-squared, 362-364 

Gaussian, 353 

moment of, 352 
Distribution function, 351 

Cauchy’s, 361 
Distributive law, 376 
Divergence, 58, 102 

of a curl, 61 

of a gradient, 59 

of series, 439 
Divergence theorem of Gauss, 75 
Dot, or scalar, product, 42, 88 


d 
dz 


Kigenfunction, 24 
Eigenvalue, 24 
Kigenvector, 24 
Einstein, Albert, summation convention 
of, 1 
Einstein-Lorentz transformations, 320 
Eisenstein, criterion of, 328 
Electric field, 165 
Electrostatic potential, 76, 165 
Elliptic integrals, 436 
Entire functions, 145 
Equally likely events, 342 
Equation, Bessel’s, 231 
differential (see Differential equations) 
Euler-Lagrange, 290 
exact, 178-179 
indicial, 210 
integral, 203 
Laplace’s, 76, 78, 226 
Legendre’s, 230, 404 
linear homogeneous, 17 
principal, 326 
wave, 172, 286 
Equations, Cauchy-Riemann, 128 
of motion, Euler’s, 115 
Hamilton’s, 113 


484 


Equations, of motion, Lagrange’s, 110 
solution by determinants, 10 
systems of, 2, 4 

homogeneous linear, 17 

Equivalence relation, 376 

Error, in Simpson’s rule, 473-474 
in Taylor’s series, 462 

Essential singularity, 157 

Euclidean coordinates, 93 

Kuclidean space, 93, 107 

Eudoxus, 385 

Euleér’s equations of motion, 115 

Euler-Lagrange equation, 290 

Euler-Maclaurin sum formula, 474-478 

Even function, 256 

Exact equation, 178-179 

Expansions, in Fourier series, 252, 260 
in Maclaurin’s series, 462 
in orthogonal polynomials, 246, 251 
in Taylor’s series (see Taylor’s series or 

expansion) 

Expectation, 354 

Exterior point, 387 

Extrema of functions, 466—468 


Factor, integrating, 179 
Factor group, 307-310 
index of, 308 
Factorials, 168-171 
Stirling’s formula for, 477-478 
Family of curves, 174 
Field, 122, 323, 380 
automorphisms of, 331 
electric, 165 
extension of, algebraic, 324-326 
normal, 330 
of probability, 343 
of sets, 342 
steady-state, 41 a 
vector, conservative, 67 
irrotational, 69, 73 
solenoidal, 77 
steady-state, 41 
uniform, 42 
First-order differential equation, 175 
First theorem of the mean, 422 
Flat space, 107 
Force moment, 116 
Formula, Cauchy’s integral, 143 
interpolation, of Lagrange, 471 
Stirling’s, 477-478 
Fourier coefficients, 245, 252 
Fourier integral, 265 
Fourier partial sum, 245, 258 


ELEMENTS OF PURE AND APPLIED MATHEMATICS 


Fourier partial sum, minimal property 
of, 248 
Fourier remainder, 247 
Fourier series, convergence of, 251, 256 
differentiation of, 262 
expansion in, 252, 260 
integration of, 264 
partial, minimal property of, 248 
trigonometric, 252 
Fourier transform, 268 
Frenet-Serret formulas, 52-53 
Frobenius, method of, 215 
Function, analytic, 130 
Bessel, 232 
beta, 170 
of bounded variation, 277 
characteristic, 357 
of a complex variable (see Complex 
variable) 
conjugate, 124 
continuous, 126, 398 
piecewise, 258 
sectionally, 258 
uniformly, 399 
density, or weight, 237 
differentiable, 127, 402 
discontinuous, 398 
distribution, 35] 
entire, 145 
even, 256 
expansion of, 145 
extrema of, 466—468 
factorial, or gamma, 168 
generating, 234, 476 
homogeneous, 177 
hypergeometric, 222, 223 
integrable, Lebesgue, 478-480 
Riemann, 419-424 
square, 246, 257 
limit of, 397-398 
memory, 270 
of more than one variable, 407 
odd, 256 
potential, 111 
probability, 352 
rational, 434 
of a real variable, 396 
regular, 130 
symmetric, 320, 321 
transfer, 270 
Functional, 292 
Functions, orthogonal, 237 
(See also Complex functions) 
Fundamental theorem, of algebra 
(Gauss), 145 
of arithmetic, 381 
of the integral calculus, 142, 423 


INDEX 


Galois group, 332 
Galois resolvent, 330 
Game theory, 368-372 
Gamma function, 168-171 
Gauss divergence theorem, 75 
Gauss fundamental theorem of algebra, 
145 
Gaussian distribution, 353 
General solution of differential equations, 
182, 191 
Generating function, 234, 476 
Geodesic coordinates, 104 
Geodesics, in a Riemannian space, 95 
of a sphere, 289 
Gradient of a scalar, 55 
Greatest common divisor, 323 
Green’s theorem, 77 
Group, Abelian, 300 
automorphism of, 304 
inner, 305 
central of, 307 
character of, 312 
conjugate elements of, 306 
conjugate subgroups of, 306 
continuous transformation, 314-315 
cyclic, 300 
definition of, 298 
factor, 307 
index of, 308 
finite, 300 
Galois, 332 
octic, 312 
order of, 301 
quotient, 307 
index of, 308 
regular permutation, 303 
representation of, 313 
irreducible, 314 
reducible, 314 
simple, 307, 310 
subgroup of, 301 
invariant, 306 
normal, 306 
maximal, 309 
symmetric, 302 
Groups, intersection of, 307 
isomorphism of, 303 
prime-factor, 311 
product of, 308 


Hamilton-Cayley theorem, 35 
Hamiltonian, 113 

Hamilton’s equations of motion, 113 
Harmonic series, 440 

Heat flow, 227 

Heine-Borel theorem, 392-393 


485 


Helix, 53 
Hermite polynomials, 233, 239, 241 
generating function for, 234 
Hermitian matrix, 34 
Hessian, 419 
Hilbert space, 391 
Homogeneous function, 177 
Homogeneous linear equation, 17 
Hyperbolic cosine, 150, 158 
Hyperbolic sine, 150 
Hypergeometric equation, 221 
Hypergeometric function, 222 
of Kummer, 223 
Hypersurface, 95 


Identity theorem, 151 
Impedance, 270 
Implicit differentiation, 403 
Implicit function theorems, 412-419 
Improper integral, 282, 427 
Cauchy value of, 282 
Independence, linear, 187 
Independent events, 347-348 
Index of factor group, 308 
Indicial equation, 210 
Inertia, moment of, 117 
tensor, 116 
Infemum, 383 
Infinite integral, 428 
uniform convergence of, 429 
Infinite series, of constants, 439 
convergence of, absolute, 444, 455 
conditional, 444 
test for, 440—444 
uniform, 450-454 
expansion in, 246, 252, 461, 465 
of functions, 449 
theorems concerning, 447—450 
of trigonometric functions, 252 
Infinitesimal, 397 
Infinitesimal transformation, 315 
Infinity, point at, 158, 218 
Information theory, ‘“‘bits’’ of informa- 
tion, 346 
Integers, commutative law of, 375 
positive, 374-377 
properties of, 380-381 
rational, 378-379 
Integrable-square functions, 246, 257 
Integral, Darboux, 419 
definite (see Definite integral) 
derivative of, 142 
elliptic, 436 
equation, 203 
formula of Cauchy, 143 
Fourier, 265 


486 


Integral, improper, 427 

infinite, 428, 429 

Lebesgue, 478-480 

line, 66 

mean-value theorem, 422, 457 

Riemann, 131, 419 

Stieltje, 280 

test for convergence, 441 

theorem of Cauchy, 136 
Integral equation of Volterra, 203 
Integrals with a parameter, 142, 425-426 
Integrating factors, 179 
Integration, of a complex variable, 131- 

136 

contour, 160-162 

of Fourier series, 264 

of Laplace’s equation, 78 

methods of, 431-436 

numerical, 474-477 

by parts, 148, 424 

of a real variable, 419-420 

of series, 450, 458 

term-by-term, 448, 458 
Interior point, 387 
Interlacing of zeros, 199, 244 
Interpolation formula of Lagrange, 471- 

472 

Intersection of sets, 388 
Interval, closed, 386 

of convergence, 386, 453 

open, 386 
Invariant, 320 

subgroup, 306 
Irreducible polynomial, 324 
Irrotational vector field, 69, 73 
Isomorphism, 385 

of groups, 303 
Isoperimetric problem, 297 


Jacobian, 62 é 
Jordan curve, 131 
Jordan-Hdlder theorem, 310-312 


Kepler’s first law of planetary motion, 52 

Kinetic energy, 111 

Kronecker, 324 

Kronecker delta, 3, 9 

Kryloff-Bogoluiboff, method of, 271 

Kummer’s confluent hypergeometric 
function, 223 


Lagrange’s equations of motion, 110 
Lagrange’s interpolation polynomial, 
471-472 


ELEMENTS OF PURE AND APPLIED MATHEMATICS 


Lagrange’s method of multipliers, 296, 467 
Lagrangian, 111 
Laguerre polynomials, 224, 239, 241 
associated, 224 
ordinary, 226 
Laplace transformation, 282 
inversion theorem of, 287 
table of, 285 
Laplace’s equation, 76, 226 
integration of, 78 
solution of, 226 
Laplacian, 59, 103 
in cylindrical coordinates, 65 
in orthogonal coordinate system, 65 
in spherical coordinates, 103 
in tensor form, 103 
Laurent expansion, 154 
Law of the mean, for differential calcu- 
lus, 405 
for integral calculus, 422, 424, 457 
Lebesgue integral, 478-480 
Legendre polynomial, 230, 239, 404 
associated, 230 
Legendre’s duplication formula, 169 
Legendre’s equation, 230, 404 
Leibniz, rule of, 403 
Length of arc, 52, 95 
L’Hospital’s rule, 405
Limit, 397-398 
Limit cycle, 273 
Limit point, 386 
Line element, 93 
Line integral, 66 
Linear dependence, 20, 187 
Linear differential equations, with constant coefficients, 191
definition of, 175 
exact, 178 
existence theorem of, 183 
first-order, 175, 177 
general solution of, 182, 191 
second-order, 199, 202 
indicial equation, 210 
ordinary point, 202 
properties, 199-201 
regular singular point, 210 
systems of, 190 
Linear equations, 2 
homogeneous, 17 
Linear independence, 187 
Liouville’s theorem, 145 
Lipschitz criterion, 183 
Ln z, 148 


M test of Weierstrass, 451 
Maclaurin series or expansion, 462 


Matrices, 11 
comparable, 12 
equality of, 11 
multiplication of, 12-13 
sum of, 12 
Matrix, characteristic equation of, 24 
column, 20 
derivative of, 13 
determinant of, 5, 13 
Hermitian, 34 
identity, 14 
inverse of, 16, 29-30 
multiplication by a scalar, 12 
negative of, 12 
orthogonal, 22 
skew-symmetric, 14 
spur of, 15 
square, 11 
nonsingular, 16 
subdivision of, 32 
symmetric, 14 
trace, 15 
transpose of, 14 
triangular, 28 
unitary, 22 
zero, 12 
Maximal normal subgroup, 309 
Maximum-modulus theorem, 151 
Mean, law of, 405, 422, 424, 457 
Mean or expected value, 405 
Memory function, 270 
Methods of integration, 431-436 
Metric tensor, 93 
Minimal property of partial Fourier 
series, 248 
Mixed strategy, 370 
Moment, of a distribution, 352 
of force, 116
of inertia, 117 
Momentum, angular, 115 
generalized, 113 
Monodromy theorem, 153 
Monte Carlo methods, 364-368 
Morera’s theorem, 145 
Motion, equations of (see Equations) 
in a plane, 52 
of rigid body, 46-47
Multipliers of Lagrange, 296, 467 
Mutually exclusive events, 343 


Navier-Stokes equation of motion, 117 
Neighborhood, deleted, 387 

spherical, 391 
Nested sets, theorem of, 392 
Newton’s binomial expansion, 339 
Nonessential singular point, 157




Nonlinear differential equations, 271
Normal to a surface, 56, 71
Normal acceleration, 52 
Normal coordinates, 32 
Normal derivative, 56 
Normal distribution of Gauss, 353 
Normal extension of a field, 330 
Normal subgroup, 306 
maximal, 309 
Normalizing factors, 240 
Null set, 389 
Numbers, algebraic, 396 
Bernoulli, 475-477 
cardinal, 394 
complex, 122-124 
conjugate, 325 
primitive, 328 
real, 382 
triples, 81 
Numerical integration, 474-477 
Numerical methods, 469-478 


Octic group, 312 
Odd function, 256 
One-to-one correspondence, 394 
Open interval, 386 
Open set, 387 
Operator, of Boole, 221 

of continuation, 154, 207 

differential, 192 

vector, 57, 60 
Ordered set, 377 
Ordering theorems, 377 
Ordinary differential equation, 172 
Ordinary point, 202 
Orthogonal curvilinear coordinates, 62 
Orthogonal functions, 237 
Orthogonal polynomials (see Polynomials)

Orthogonal trajectories, 176 
Orthogonal transformations, 22 
Orthogonal vectors, 82 
Orthonormal set, 241 


Papperitz, 221

Paraboloidal coordinates, 95 

Parallel displacement, 108 

Parallel vectors, 38, 108-109 

Parallelogram law of addition, 38, 123 

Parameters, integrals containing, 425 
variation of, 197 

Parseval’s identity, 265 

Partial derivative, 408 

Partial differential equations, Fourier’s 

method of solving, 226-228 




Partial differential equations, heat equation, 368
Laplace’s equation, 226 
wave equation, 172, 286 
Partial differentiation, 408 
Partial fractions, 431-432
Partial sums, 245, 258 
Particular solutions, 182, 194-198 
Peano’s postulates, 374 
Pearson, K., χ² distribution of, 363
Permutations, 337-338 
Picard’s method of successive approxi- 
mations, 184, 203 
Picard’s theorem, 157 
Piecewise continuous function, 258 
Point, boundary, 387 
branch, 149 
exterior, 387 
at infinity, 158, 218 
interior, 387 
limit, 386 
neighborhood of, 387 
deleted, 387 
ordinary, 202 
set theory, 386-389 
singular, 156, 157
of differential equation, 207
essential, 157 
nonessential, 210 
regular, 210 
removable, 157 
Poker hands, 338 
Poles, simple, 157 
Polynomials, 322 
Bernoulli, 475-477 
Hermite, 233, 234, 239, 241
irreducible, 324 
Lagrange’s interpolation, 471-472 
Laguerre, 224, 226, 239, 241 
Legendre, 230, 239, 404 
normal, 330 
orthogonal, 237 
closed sets of, 249 
complete sets of, 249 
completeness of, 250 
difference equation for, 243 
expansions in, 246, 251 
zeros of, 242 
relatively prime, 324 
separable, 328 
Tchebysheff, 241 
Positive-definite quadratic form, 28 
Positive integers, 374-377 
Potential, electrostatic, 76, 165 
Potential function, 111 
Power series, 453 
convergence of, 453 


Power series, convergence of, radius of, 453
uniform, 453-454
differentiation of, 454 
expansions in, 462 
functions defined by, 151 
integration of, 455 
solutions of differential equations by, 

206, 212-217 

Pressure, 119 
Prime factors of composition, 311 
Primitive number, 328 
Principal directions, 118 
Principal equation, 326 
Probability, a posteriori, 345 

a priori, 345 

axiomatic definition of, 343 

central-limit theorem of, 359-361 

conditional, 345 

continuous, 351 

field of, 343 

function of, 352 

distribution, 351 
Gauss, distribution of, 353 
joint, 355 

Pursuit problem, 54 


Quadratic forms, 21 
positive-definite, 28 
Quotient group, 307 
index of, 308 
Quotient law of tensors, 90 


Raabe’s test, 442 
Radius, of convergence, 146, 453 
of curvature, 53 

Random variable, 351 

Random-walk problem, 350, 367 

Ratio test of D’Alembert, 441 

Rational function, 434 

Rational integers, 378-379 

Rational numbers, 379-380 

Real-number system, 382-385 

Recapitulation of vector differential formulas, 61

Reciprocal tensors, 92, 94 

Rectifiable path, 277 

Recursion formula, 202 

Regular function, 130 

Regular permutation group, 303 

Regular singular point, 210 

Relative tensors, 89 

Relative velocity, 54 

Remainder in Taylor series expansion, 
462 




Residue, 158, 160 
Ricci tensor, 106 
Riemann integral, complex-variable 
theory, 131 

real-variable theory, 419-424 
Riemann surface, 149 
Riemann-Christoffel tensor, 106 
Riemannian coordinates, 105 
Riemannian curvature tensor, 106 
Riemannian metric, 93 
Riemannian space, 93 

geodesics in, 95 
Rigid body, motion of, 46-47 
Ring, 379 
Rolle’s theorem, 404 
Rule, Cramer’s, 10 

Simpson’s, 473-474 
Russell number, 382-385 


Saddle element, 369 
Scalar, 37 

gradient of, 55 

Laplacian of, 59, 103 
Scalar curvature, 106 
Scalar product of vectors, 42, 88 

triple, 47 
Schwarz-Cauchy inequality, 391 
Schwarz-Christoffel transformation, 163 
Second law of the mean, 457 
Second-order differential equations, 199, 

202 

properties of, 199-201 
Sectionally continuous function, 258 
Separation of variables, method of, 227 
Sequence, 437 

convergence of, Cauchy criterion for, 

437 
convergent, 437 
uniformly, 446 

Sequential approach, 391 
Series, 439 

alternating, 443 

comparison test for, 441 

of composition, 311 

convergent, 439 

differentiation of, 450 

divergent, 439 

Fourier (see Fourier series) 

harmonic, 440 

infinite (see Infinite series) 

integration of, 450, 458 

Maclaurin, 462 

oscillating, 439 

power (see Power series) 

sum of, 439 




Series, Taylor’s (see Taylor’s series or 
expansion) 

uniformly convergent, 450 
Set, boundary of, 389 

bounded, 383, 386 

Cantor, 396 

closed, 386, 388 

closure of, 389 

complement of, 387 

countable, 394 

denumerable, 394 

diameter of, 390 

infimum of, 383

limit point of, 386, 389 

nondenumerable, 394 

null, 389 

open, 387 

ordered, 377 

orthonormal, 241 

supremum of, 383 

totally ordered, 377 

uncountable, 394 
Sets, Cantor’s theory of, 394-396 

field of, 342 

intersection of, 388 

nested, theorem of, 392 

union of, 388
Simple closed curve, 131 
Simple group, 307, 310 
Simple mappings, 124 
Simpson’s rule, 473-474 
Simultaneous differential equations, 190 
Sine z, 131 
Singular points (see Point) 
Singularities, confluence of, 223 
Sinh z, 150 
Slope, 402 
Space curve, 52 

arc length of, 52 

curvature of, 53 

Jordan, 131 

tangent to, 52 

torsion of, 53 

unit binormal of, 53 

unit principal normal of, 53 
Spherical coordinates, 63 
Steady-state vector field, 41 
Stieltjes integral, 280
Stirling’s formula, 170, 477-478 
Stochastic variable, 351 
Stokes’s theorem, 70 
Strain tensor, 118 
Stress tensor, 118 
Sturm-Liouville equation, 200 
Subgroups, conjugate, 306 

invariant, 306 

normal, 306 


Su} oh Be? » vosf t 
Subtraction of vectors, 38
Sum of a series, 439
Summation convention, 1
Superscripts, 1
Supremum, 383
Surface, 70

normal to, 56, 71 
Symmetric functions, 320 

fundamental theorem of, 321 
Symmetric group, 302 
System of equations, 4 


Tan⁻¹ z, 150
Tangent to a space curve, 52, 84 
Tauber, example of, 455 
Taylor’s series or expansion, 145, 461-465
for functions of n variables, 464-465
remainder in, 462 
uniqueness of, 147 
Tchebysheff’s theorem, 354 
Tensor, components, 89 
contraction, 90 
contravariant, 89 
covariant, 89 
curvature, 89 
density, 89 
inertia, 116 
isotropic, 91 
metric, 93 
mixed, 89 
Ricci, 106 
Riemann-Christoffel, 106 
strain, 118 
stress, 118 
weight of, 89 
Tensors, absolute, 89 
addition of, 89 
product of, 89 
quotient law of, 90 
reciprocal, 92, 94 
relative, 89 
Termwise differentiation and integration, 
448-450, 458 
Theorem of the mean, differential calculus, 405
integral calculus, 422, 423, 457 
Theory of games, 368-372 
Torque, 116 
Torsion of a space curve, 53 
Total differential, 409 
Totally ordered set, 377 
Trajectories, orthogonal, 176 
Transfer function, 270 




Transform, Fourier, 268 
Laplace, 282 
Transformations, affine, 313 
bilinear, 125 
continuous group, 314-315 
infinitesimal, 315 
Laplace (see Laplace transformation)
orthogonal, 22 
Schwarz-Christoffel, 163 
Triangular matrix, 28 
Trigonometric series, 252 
Trihedral, 41, 53, 63 
Triple scalar product, 47 
Triple vector product, 48 


Uncountable set, 394 
Undetermined coefficients, method of, 194

Unicursal curve, 435 

Uniform continuity, 399 

Uniform convergence, Abel’s test for, 
of a sequence, 446 
of a series, 450 
Weierstrass M test for, 451 

Union of two sets, 388 

Unitary matrix, 22 


Vandermonde determinant, 334 
Van der Pol’s equation, 273 
Variation, bounded, 386 
calculus of (see Calculus of variations)
of parameters, 197 
Vector, 37 
associated, 94 
binormal, 53 
characteristic, 24 
column, 20 
components of, 41, 81 
physical, 87 
conservative, 67 
contravariant, 84 
curl of, 59, 83, 102 
derivative of, 50 
divergence of, 58, 102 
irrotational, 69, 73 
length of, 37, 94 
normal, 53 
operator del (V), 57 
parallel displacement of, 108 
potential, 77 
solenoidal, 77 
space, 41 
tangent, 52 
unit, 37, 81 
zero, 37 


Vector field (see Field)

Vectors, addition of, 38

angle between, 42
differentiation of, 50, 61
equality of, 37

linear combination of, 39 
orthogonal, 82 

parallel, 38, 108-109 

scalar or dot product of, 42, 88 
subtraction of, 38 

triple scalar product of, 47 
triple vector product of, 48 
unit, fundamental, 41 

vector or cross product of, 45 
Velocity, 50

angular, 46 




Velocity, relative, 54
Volterra integral equation, 203


Wave equation, 172, 286
Weierstrass approximation theorem, 250


Weierstrass M test, 451
Weierstrass-Bolzano theorem, 389
Work, 112


Wronskian, 187 


Zermelo postulate, 392 
Zero, Aleph, 394 
Zero matrix, 12 
Zero sum game, 368 
Zeros, interlacing of, 199, 244 
of orthogonal polynomials, 242 


